Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: generic write() method #16

Closed
jmcnamara opened this issue Feb 10, 2023 · 28 comments
Closed

Proposal: generic write() method #16

jmcnamara opened this issue Feb 10, 2023 · 28 comments
Assignees
Labels
enhancement New feature or request

Comments

@jmcnamara
Copy link
Owner

jmcnamara commented Feb 10, 2023

It would be possible to have a generic write() method using a trait that could call the type specific write_*() method such as write_number(). This is a feature in the Python version of the library.

I'll leave this here as a reminder to myself and to see if it gets an +1s for usefulness or desirability.

@jmcnamara jmcnamara added the question This is a general question label Feb 10, 2023
@jmcnamara jmcnamara self-assigned this Feb 10, 2023
@jmcnamara jmcnamara added enhancement New feature or request and removed question This is a general question labels Feb 10, 2023
@moutikabdessabour
Copy link

Can I take a shot at this?

@jmcnamara
Copy link
Owner Author

jmcnamara commented Feb 12, 2023

@moutikabdessabour At this stage I'm mainly asking if people think this would be useful. Do you think it would be useful?

I'm currently 50/50 on this, for the reasons below, but I'm willing to take input.

Pros:

  • Makes writing code/examples easier.

Cons:

  • The similar write() method in the Python and Perl versions tend to generate bug reports when users try to write arbitrary sting or number like data. People also assume that writing something like "2003-01-01" should become a date in Excel.
  • There would be a slight performance overhead for strings since they would need to be checked to see if they are formulas or urls

@moutikabdessabour
Copy link

For the second con I think implementing a Formula/Url type would remove the overhead. I wrote an app using this library and while it is great the only issue I had is the verbosity of the function calls. I am a 100% for this feature

@jmcnamara
Copy link
Owner Author

jmcnamara commented Feb 14, 2023

I wrote an app using this library and while it is great the only issue I had is the verbosity of the function calls.

Yes, that is my feeling too. There are currently 24 write_*() methods. That is just too many.

It could be mapped down to 5 like this:

Current New
write_number() write()
write_string() write()
write_url() write()
write_boolean() write()
write_formula() write()
write_rich_string() write()
write_blank() write()
write_time() write()
write_date() write()
write_datetime() write()
write_dynamic_formula() write()
write_number_with_format() write_with_format()
write_string_with_format() write_with_format()
write_url_with_format() write_with_format()
write_boolean_with_format() write_with_format()
write_formula_with_format() write_with_format()
write_rich_string_with_format() write_with_format()
write_dynamic_formula_with_format() write_with_format()
write_array_formula() write_array_formula()
write_dynamic_array_formula() write_array_formula()
write_array_formula_with_format() write_array_formula_with_format()
write_dynamic_array_formula_with_format() write_array_formula_with_format()
write_url_with_options() Same
write_url_with_text() Same, or write_url_with_options()

With some caveats/additions:

  • It will need the addition of a new Url type.
  • It will need the addition of a new Formula type that can be set to array or dynamic array (if needed)
  • Default DateTime, Date and Time formats for datetime()/date()/time() variants without a user specified format since a date without a format in Excel is just a number.

Also a IntoExcelType (or similar) trait coupled with a generic write() would have the advantage of allowing mapping of other arbitrary user types in a clean way.

Now, that I've mapped it out like this it seems like the right thing to do (even if it is another major API break). To avoid breakage the type specific methods could stay but be undocumented/hidden. They would be needed for the trait mapping anyway.

I might add it after I finish the current work on charts.

@moutikabdessabour
Copy link

Gonna take a jab at this later today. WIll probably have something ready tomorrow night.

@jmcnamara
Copy link
Owner Author

Gonna take a jab at this later today. WIll probably have something ready tomorrow night.

There is no need. I will get to it myself after the current release.

@jmcnamara
Copy link
Owner Author

jmcnamara commented Feb 15, 2023

BTW that isn't meant to deter you from doing a prototype. I just want to flag that it will needs a lot of doc and example refactoring and I'd prefer to do that myself.

@GlennMm
Copy link

GlennMm commented Feb 24, 2023

i would love the feature as it would enable me for example to write one function to write json data in array format (structured differently) with having to write many functions for a specific struct to write json to excel. If there is any way i can archive this can please show me how to and also you dim this feature as unnecessary then can you link me a rs lib that enable me todo so if its available.

@jmcnamara
Copy link
Owner Author

@GlennMm At this point I'm definitely going to add it. Either in the next release or a separate release just after.

@GlennMm
Copy link

GlennMm commented Feb 27, 2023

@GlennMm At this point I'm definitely going to add it. Either in the next release or a separate release just after.

Thank you @jmcnamara that would awesome, looking forward to this feature

jmcnamara added a commit that referenced this issue Feb 27, 2023
Add support for a generica write() method and also the IntoExcelData
trait to allow users to map their own data types to Excel types.

Feature request #16
@jmcnamara
Copy link
Owner Author

jmcnamara commented Feb 27, 2023

I've added an initial version of this feature to main. It needs some additional docs and examples before it is ready for release.

It allows 2 new important features. The first is a generic write() method to simplify code:

use rust_xlsxwriter::{Workbook, XlsxError};

fn main() -> Result<(), XlsxError> {
    // Create a new Excel file object.
    let mut workbook = Workbook::new();

    // Add a worksheet to the workbook.
    let worksheet = workbook.add_worksheet();

    // Write a string to cell (0, 0) = A1.
    worksheet.write(0, 0, "Hello")?;

    // Write a number to cell (1, 0) = A2.
    worksheet.write(1, 0, 12345)?;

    // Save the file to disk.
    workbook.save("hello.xlsx")?;

    Ok(())
}

Output:
screenshot

The second feature is that you can now use the same mechanism to extend write() for your own data types:

//! Example of how to extend the the rust_xlsxwriter write() method using the
//! IntoExcelData trait to handle arbitrary user data that can be mapped to one
//! of the main Excel data types.

use rust_xlsxwriter::*;

fn main() -> Result<(), XlsxError> {
    // Create a new Excel file object.
    let mut workbook = Workbook::new();

    // Add a worksheet to the workbook.
    let worksheet = workbook.add_worksheet();

    // Make the first column wider for clarity.
    worksheet.set_column_width(0, 12)?;

    // Write user defined type instances that implement the IntoExcelData trait.
    worksheet.write(0, 0, UnixTime::new(0))?;
    worksheet.write(1, 0, UnixTime::new(946598400))?;
    worksheet.write(2, 0, UnixTime::new(1672531200))?;

    // Save the file to disk.
    workbook.save("write_generic.xlsx")?;

    Ok(())
}

// For this example we create a simple struct type to represent a Unix time.
// This is the number of elapsed seconds since the epoch of January 1970 (UTC).
// See https://en.wikipedia.org/wiki/Unix_time. This type isn't handled by
// default by rust_xlsxwriter.
pub struct UnixTime {
    seconds: u64,
}

impl UnixTime {
    pub fn new(seconds: u64) -> UnixTime {
        UnixTime { seconds }
    }
}

// Implement the IntoExcelData trait to map our new UnixTime struct into an
// Excel type.
//
// The relevant Excel type is f64 which is used to store dates and times (along
// with a number format). The Unix 1970 epoch equates to a date/number of
// 25569.0. For Unix times beyond that we divide by the number of seconds in the
// day (24 * 60 * 60) to get the Excel serial date.
//
// We must also supply a number format if one isn't specified.
//
impl IntoExcelData for UnixTime {
    fn write(
        self,
        worksheet: &mut Worksheet,
        row: RowNum,
        col: ColNum,
    ) -> Result<&mut Worksheet, XlsxError> {
        // Create a default date format.
        let format = Format::new().set_num_format("yyyy-mm-dd");

        // Convert the Unix time to an Excel datetime.
        let datetime = 25569.0 + (self.seconds / (24 * 60 * 60)) as f64;

        // Write the date as a number with a format.
        worksheet.write_number_with_format(row, col, datetime, &format)
    }

    fn write_with_format<'a>(
        self,
        worksheet: &'a mut Worksheet,
        row: RowNum,
        col: ColNum,
        format: &'a Format,
    ) -> Result<&'a mut Worksheet, XlsxError> {
        // Convert the Unix time to an Excel datetime.
        let datetime = 25569.0 + (self.seconds / (24 * 60 * 60)) as f64;

        // Write the date with the user supplied format.
        worksheet.write_number_with_format(row, col, datetime, format)
    }
}

Output:

screenshot

Comments/reviews welcome.

jmcnamara added a commit that referenced this issue Feb 27, 2023
Add support for a generica write() method and also the IntoExcelData
trait to allow users to map their own data types to Excel types.

Feature request #16
@jmcnamara
Copy link
Owner Author

jmcnamara commented Feb 28, 2023

Also, if the Format was treated as an Option it would be possible to have just one method that needs to be implemented for the trait:

impl IntoExcelData for UnixTime {
    fn write<'a>(
        self,
        worksheet: &'a mut Worksheet,
        row: RowNum,
        col: ColNum,
        format: Option<&'a Format>,
    ) -> Result<&'a mut Worksheet, XlsxError> {
        // Convert the Unix time to an Excel datetime.
        let datetime = 25569.0 + (self.seconds / (24 * 60 * 60)) as f64;

        match format {
            Some(format) => {
                // Write the date with the user supplied format.
                worksheet.write_number_with_format(row, col, datetime, format)
            }
            None => {
                // Create a default date format.
                let format = Format::new().set_num_format("yyyy-mm-dd");
                worksheet.write_number_with_format(row, col, datetime, &format)
            }
        }
    }
}

This may, or may not, be more confusing to the end user. Internally it works out a lot cleaner. This code is on the generic_with_option branch for comparison.

jmcnamara added a commit that referenced this issue Feb 28, 2023
Add support for a generica write() method and also the IntoExcelData
trait to allow users to map their own data types to Excel types.

Feature request #16
jmcnamara added a commit that referenced this issue Feb 28, 2023
@jmcnamara
Copy link
Owner Author

@Fight-For-Food Any thoughts on the interface or implementation of this feature? You had some good input on other generic/non-generic code.

@Fight-For-Food
Copy link
Contributor

@Fight-For-Food Any thoughts on the interface or implementation of this feature? You had some good input on other generic/non-generic code.

I think that
fn write(self, worksheet: &mut Worksheet, row: RowNum, col: ColNum) -> Result<&mut Worksheet, XlsxError>{....}
is ok. It is convenient to use. And there is no need to make in more complex with extra-argument format: Option<&'a Format>
For some more comprehensive thoughts i need more time. Maybe i will have some at this week. But can't make a promise about it.

@jmcnamara
Copy link
Owner Author

jmcnamara commented Feb 28, 2023

@Fight-For-Food

And there is no need to make in more complex with extra-argument format: Option<&'a Format>

It may not be clear from the snippet above but the addition of the Option<> means that there is only one trait funtion that needs to be implemented by the end user. However, it would probably be confusing and the version on main with IntoExcelData::write() and IntoExcelData::write_with_format() mimics the worksheet APIs may be simpler.

Maybe i will have some at this week. But can't make a promise about it.

Thanks. No problem if you can't get to it.

@GlennMm
Copy link

GlennMm commented Mar 8, 2023

thank you for implementing this, it really help make my work much easier. looking to forward to be start contributing to this repo. so have started using this feature i haven't seen any problems with it so far

@moutikabdessabour
Copy link

@jmcnamara Looks great, You could allow the users to define only one function but get access to both write and write_with_format, the trait definition will have three functions the write, write_with_format and an internal function that would actually implement the logic with the format arg passed in as an option.

trait IntoExcelData {
    fn internal_write<'a>(
        self,
        worksheet: &'a mut Worksheet,
        row: RowNum,
        col: ColNum,
        format: Option<&'a Format>,
    ) -> Result<&'a mut Worksheet, XlsxError>;
    fn write(
        self,
        worksheet: &mut Worksheet,
        row: RowNum,
        col: ColNum,
    ) -> Result<&mut Worksheet, XlsxError>{
        self.internal_write(&mut worksheet, row, column, None)
    }

    fn write_with_format<'a>(
        self,
        worksheet: &'a mut Worksheet,
        row: RowNum,
        col: ColNum,
        format: &'a Format,
    ) -> Result<&'a mut Worksheet, XlsxError>{
        self.internal_write(&mut worksheet, row, column, Some(format))
    }

}

@jmcnamara
Copy link
Owner Author

Just a heads up that the initial version of this is upstream in crates.io in version 0.27.0: https://crates.io/crates/rust_xlsxwriter

I'll iterate and improve on this in the next few releases.

@jmcnamara jmcnamara reopened this May 5, 2023
@jmcnamara
Copy link
Owner Author

Having taken a small detour into charts I've come back to complete this.

Some tasks, not in order:

  • Use impl Into<String> instead of &str in method parameters where possible/sensible.
  • Implement generic write_row() and write_column() methods.
  • Add types for Url and Formula and add them to generic write().

@jmcnamara
Copy link
Owner Author

jmcnamara commented May 6, 2023

Implement generic write_row() and write_column() methods.

I've added a first pass at this to main. You can now write arrays of data types that implement IntoExcelData like this:

use rust_xlsxwriter::{Workbook, XlsxError};

fn main() -> Result<(), XlsxError> {
    // Create a new Excel file object.
    let mut workbook = Workbook::new();

    // Add a worksheet to the workbook.
    let worksheet = workbook.add_worksheet();

    // Some array data to write.
    let numbers = [1, 2, 3, 4, 5];
    let words = ["Hello"; 5];
    let matrix = [
        [10, 11, 12, 13, 14],
        [20, 21, 22, 23, 24],
        [30, 31, 32, 33, 34],
    ];

    // Write the array data as columns.
    worksheet.write_column(0, 0, numbers)?;
    worksheet.write_column(0, 1, words)?;

    // Write the array data as rows.
    worksheet.write_row(0, 3, numbers)?;
    worksheet.write_row(1, 3, words)?;

    // Write the matrix data as an array or rows and as an array of columns.
    worksheet.write_row_matrix(3, 3, matrix)?;
    worksheet.write_column_matrix(6, 0, matrix)?;

    // Save the file to disk.
    workbook.save("arrays.xlsx")?;

    Ok(())
}

Output:

screenshot

It is just some syntactic sugar around a loop but it is somewhat popular in the Python version.

It currently has an additional deference/copy and I'm not sure if that can be avoided. Feedback welcome on that or anything else.

@adriandelgado
Copy link
Contributor

I gave some feedback on the last commit. I think that feedback applies on more parts of the codebase.

jmcnamara added a commit that referenced this issue May 6, 2023
@jmcnamara
Copy link
Owner Author

@adriandelgado

I think the Copy trait bound could be avoided by using the IntoIterator trait.

I initially tried, and failed, to get that working. :-( With my morning brain and 2 cups of coffee I managed to understand/fix it. You can have a look.

I gave some feedback on the last commit. I think that feedback applies on more parts of the codebase.

I think I understood those comments but maybe not so let's check. What I think you are saying is that the API/library should avoid taking any references to data that will then be cloned and should instead use interfaces like Into<String> or IntoInterator. Is that correct or am I missing something else?

@jmcnamara
Copy link
Owner Author

jmcnamara commented May 6, 2023

I've also added write_row_matrix() and write_column_matrix() methods for 2D arrays/iterators of data. See the updated example above.

jmcnamara added a commit that referenced this issue May 6, 2023
jmcnamara added a commit that referenced this issue May 6, 2023
jmcnamara added a commit that referenced this issue May 6, 2023
jmcnamara added a commit that referenced this issue May 7, 2023
jmcnamara added a commit that referenced this issue May 7, 2023
jmcnamara added a commit that referenced this issue May 7, 2023
jmcnamara added a commit that referenced this issue May 7, 2023
jmcnamara added a commit that referenced this issue May 7, 2023
jmcnamara added a commit that referenced this issue May 7, 2023
@adriandelgado
Copy link
Contributor

I think I understood those comments but maybe not so let's check. What I think you are saying is that the API/library should avoid taking any references to data that will then be cloned and should instead use interfaces like Into<String> or IntoInterator. Is that correct or am I missing something else?

That's right. The caller of the function has the responsability to clone if needed. Sometimes the caller already has owned data and knows it wont need it anymore, no clone needed in that case.

@jmcnamara
Copy link
Owner Author

I've implemented a Formula type so that it can be used in the generic write():

    worksheet.write(0, 0, Formula::new("=1+2+3"))?;
    worksheet.write(1, 0, Formula::new("=SIN(PI())"))?;

However, I'm not sure how much of a help this is yet.

@jmcnamara
Copy link
Owner Author

jmcnamara commented May 10, 2023

I had hoped to be able to use Trait Objects and the newer array writing methods to simulate something like writing a heterogeneous vec of vectors like a dataframe.

Something like this:

    let data: Vec<Vec<Box<dyn IntoExcelData>>> = vec![
        vec![
            Box::new("East"),
            Box::new("West"),
            Box::new("North"),
            Box::new("South"),
        ],
        vec![
            Box::new(40),
            Box::new(40),
            Box::new(20),
            Box::new(30)],
        vec![
            Box::new(30),
            Box::new(30),
            Box::new(50),
            Box::new(30)],
        vec![
            Box::new(Formula::new("=B1+C1")),
            Box::new(Formula::new("=B2+C2")),
            Box::new(Formula::new("=B3+C3")),
            Box::new(Formula::new("=B4+C4")),
        ],
    ];

However, I've failed in every attempt so far. Examples like this with a simpler data structure:

use rust_xlsxwriter::{Formula, IntoExcelData, Workbook, XlsxError};

fn main() -> Result<(), XlsxError> {
    // Create a new Excel file object.
    let mut workbook = Workbook::new();

    // Add a worksheet to the workbook.
    let worksheet = workbook.add_worksheet();

    let data: Vec<Box<dyn IntoExcelData>> = vec![
        Box::new("East"),
        Box::new(50),
        Box::new(30),
        Box::new(Formula::new("=B1+C1")),
    ];

    /* The following doesn't work.

        --> examples/app_trait_array.rs:39:9
        |
     40 |         item.write(worksheet, row, 1);
        |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the size of `dyn IntoExcelData`
                                                cannot be statically determined

    */
    let mut row = 1;
    for item in data {
        item.write(worksheet, row, 1);
        row += 1;
    }

    // Save the file to disk.
    workbook.save("arrays.xlsx")?;

    Ok(())
}

Or this:

    let mut row = 1;
    for item in data {
        worksheet.write(row, 0, item)?;
        row += 1;
    }

Which gives errors like this:

   --> examples/app_trait_array3.rs:28:33
    |
28  |         worksheet.write(row, 0, item)?;
    |                   -----         ^^^^ the trait `IntoExcelData` is not implemented 
                                          for `Box<dyn IntoExcelData>`
    |                   |
    |                   required by a bound introduced by this call
    |
    = help: the following other types implement trait `IntoExcelData`:

Anyone see a way that trait objects for the IntoExcelData trait might be handled?

Update: I'm still interested in this if there is a solution (there may not be a clean solution) but I'm going to shift my focus to working with real Dataframes from Polars.

@jmcnamara
Copy link
Owner Author

See also this example of writing a Polars dataframe to Excel using rust_xlsxwriter: #39

jmcnamara added a commit that referenced this issue May 19, 2023
jmcnamara added a commit that referenced this issue May 22, 2023
@jmcnamara
Copy link
Owner Author

These generic methods and types are now available in release v0.39.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

5 participants