-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Drop blank columns and sheets when flattening #241
Conversation
This will optionally include those columns (from the schema) that don't have any lines of data.
@Bjwebb , if you look at the branch "cove-1019-no-blank-sheets-docs-and-tests" you'll see that I've added a bit to the docs. I also added a new test. Now, with the changes to examples/flatten/simple tests that you did and my new test we have a test for this mode being both off and on. Are you happy to accept that? If so, feel free to merge it to this branch/P.R. with any changes you want. I haven't looked at detail to the code changes you've made, but I trust you on those as long as we have some clear docs and tests for both modes to help any future developers working on this. |
@odscjames - @Bjwebb is unlikely to see or respond to your comment here. Good work, though! It sounds like this is ready to go. For the 4 CoVEs:
|
Thanks I can only see http://1019-no-blank-sheets.dev.cove.opendataservices.coop/validator/ to test for OCDS, so unsure if there should be a version for other CoVEs @odscjames @robredpath ? 360 - I can see from this : https://dataquality.threesixtygiving.org/data/59f09a1f-74cc-4427-a3cb-e9a6dd3426b7 - that a JSON file converted to a spreadsheet converted from JSON includes lots of extra sheets that are blank. Ill confirm with 360 ( @KDuerden @morchickit ) that we should only output populated sheets IATI XML -> xlsx : I'm not sure we have this behaviour. When I convert an XML file that does not cover all parts of teh standard: http://iati.cove.opendataservices.coop/data/3105173d-22a4-4522-8e66-b4056b3b7fd8 - I receive a spreadsheet that has populated sheets. There are no blank sheets. Is this expected @Bjwebb ? |
Re: 360 data JSON > xlsx - @calummacuisdean may have an opinion, as SCVO publish 360 as JSON. If I convert an SCVO JSON file to spreadsheet: https://dataquality.threesixtygiving.org/data/3b4a76c8-91aa-4786-b6cf-6c5d0e51fb3f --> is it useful to have all the blank fields from the standard, or safe to drop them, as per this PR...? |
I have added my doc and test directly into this pull request. |
@stevieflow yes for XML data the empty sheets will not be made as we do not look at the iati schema when flattening the data. So no extra columns/sheets are made. I am not sure what the correct default behaviour here is. This pull request changes the default behaviour to ignore empty sheets/columns that are in the schema, but maybe the correct default behaviour is to keep flattentool as it is and have the flag be For each standard in cove we can then decide if we would like to switch this on or not. |
Discussion with @kindly - I'm going to do a new pull request, keeping behaviour the same as he suggests |
Original work done by @Bjwebb #241 OpenDataServices/cove#1019
This has now been done by #268, and is now closed. |
Relevant issue is in CoVE repo: OpenDataServices/cove#1019
(ie. don't pull extra columns from the schema)
A
--empty-schema-columns
flag is provided for commandline users who want the previous behaviour.