Skip to content
This repository has been archived by the owner on May 26, 2022. It is now read-only.

Add DateTime support to ODS writer #735

Closed
wants to merge 1 commit into from

Conversation

schrieveslaach
Copy link

No description provided.

@CLAassistant
Copy link

CLAassistant commented Mar 28, 2020

CLA assistant check
All committers have signed the CLA.

@@ -209,6 +209,11 @@ private function getCellXML(Cell $cell, $styleIndex, $numTimesValueRepeated)
$data .= ' office:value-type="string" calcext:value-type="error" office:value="">';
$data .= '<text:p>' . $cell->getValueEvenIfError() . '</text:p>';
$data .= '</table:table-cell>';
} elseif ($cell->isDate()) {
$dateStr = $cell->getValue()->format('Y-m-d');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spout should not decide the date format (Y-m-d is a very opinionated choice). I believe that this should be let to the user to convert the datetime to a formatted date (or more complex and unsupported at the moment: allow the user to specify the date format as part of the cell style)

Copy link
Collaborator

@adrilo adrilo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The date formatting is too opinionated. See inline comment

@adrilo adrilo closed this May 8, 2021
@schrieveslaach
Copy link
Author

I'm very confused that this PR and #781 have been closed instead of requesting changes or at least with a hint that the issue will be resolved in another approach. For me that is not community friendly.

CC: @ChronicLogic

@adrilo
Copy link
Collaborator

adrilo commented May 19, 2021

Hi Marc,

I was in the process of cleaning up issues and PRs and I could have taken more time explaining why I closed these issues. (Also remember that I'm doing this work on my free time, so I sometimes take shortcuts).

Handling DateTime, whether it's for ODS or XLSX, is a non trivial task. You need the user to define a date format somehow and make sure that the date is properly formatted, not as a string like you did, but as a proper date/timestamp. The PR was also missing tests.
So overall this PR (and the other one) was way too far from the target. That's why I decided to close it as opposed to give this explanation. Hope it's clearer now.

@schrieveslaach
Copy link
Author

Okay, thanks for clarifying. Do you have a plan to add these missing parts? Otherwise, I have to stay on my fork.

@ChronicLogic
Copy link

I'm not sure I understand the explanation. I've been writing software for nearly 30 years, and I've written many systems that deal with "non trivial" date values. Many of them had to deal with multiple users in different locales and across timezones, interacting in real time for time-sensitive actions. Point being I'm not in any way unaware that it takes a bit of care to do that right. In this case, box is a piece of some larger unknown system, a middle man. Internally box can store the date value in any way it wants as long as it does not change the data. What comes in must go out the same way. An iso 8601 string is a pretty good format. It's relatively compact. Strings are good in the sense that they are not effected by binary conversion issues (big/little endian or etc) and they don't depend on system buffer size (32/64/etc). It's a good and widely used standard format. The format stores all the necessary information, including timezone when it is applicable. Not all systems use timezone but in those cases it's not a problem since again it shouldn't change from in to out. The consumer who doesn't care about timezone simply needs to rely on the date and time not changing. Currently, without question box fails the most simple use case: You can't read in a spreadsheet and then write it back out. The data IS DEFINITELY changed or lost when it is datetime data. The pull request I submitted fixes that in my use case. The only question is: Is box internally consistent? If box will choose a format internally (I would again suggest iso 8601) and consistently apply that internally then there should be no other problem and it should enable to pass all the in/out tests you want to throw at it. The portions of box that interface with an external system (XLSX/ODF ETC) need only to translate from the internal value to the external value in a predictable way. I am not 100% sure if the xlsx storage format is based on iso 8601 but it seemed to be in my test, and given that the format is xml based that makes a lot of sense. I don't have time to review these issues for the breadth of the internal function nor each external interface. But someone should do that and I don't think it would take that long. Again without doing that, box fails the most basic of test cases. Read in, write out = FAIL. Data is lost. If the issues and pull requests are closed as a matter of housekeeping that's fine, but the date issues which currently fail tests really should be resolved in box master, or people (like @schrieveslaach) will be using a fork of some kind and that is not helpful to the master branch. That's the entire reason I submitted changes; even though I knew my changes were not 100% of the solution. At least with my change I can now pass the most basic of tests, as can many others, which would keep them on the master. I wish you good fortune and bid you farewell.

@adrilo
Copy link
Collaborator

adrilo commented May 21, 2021

While I agree with you that supporting DateTime would be a good thing, note that read then write is possible when using setShouldFormatDates(true). Upon reading the spreadsheet, it will transform the date value into a formatted string, that can then be written back (as a string though, not a date).
This option seems more correct than what you suggested, as the date format will remain the same as the original (and not always the iso-8601 format, which may or may not be the desired format).

Internally, XLSX stores date values as timestamps and the format somewhere else. So to write a date value, you would need to transform the date into a XLSX compatible timestamp (not the same as the usual timestamp) and write it into the "sheet value" file, then write the date format the user wants into the "style" file and finally reference the style id back in the "sheet value" file.
This is I believe the only proper way to bring this feature to life. Other attempts, while good, are not generic enough to support everybody's use cases. Also, this is how the Reader works, so it would make sense to keep a similar behavior for the Writer.

@schrieveslaach as for a plan to add this, TBH I won't have time to work on this any time soon I believe. Also, I'm trying to keep both XLSX and ODS features in sync so that adds even more complexity to add code for both writers...

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants