Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pad table headers up to max row length to avoid syntax errors, closes… #36

Merged
merged 1 commit into from
Mar 30, 2018

Conversation

danse
Copy link
Contributor

@danse danse commented Nov 23, 2017

… #4059

Copy link
Contributor

@mb21 mb21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pull needs a bit of work, but I think it's a good idea to pad the headers!

btw, this would close jgm/pandoc#4059

headers' = if null headers
then replicate numcols mempty
else headers
headers' = pad mempty numcols headers
Copy link
Contributor

@mb21 mb21 Nov 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the pad function is only used here, I would move it here: on the same level as headers' and numcols.

or even better, do without it:

headers' = take numcols $ headers ++ repeat mempty

@@ -1,5 +1,5 @@
Name: pandoc-types
Version: 1.17.3
Version: 1.17.4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jgm will increment the version number in a separate commit when he releases a new version... no need to do it in a pull request. (same for the changelogs AFAIK)

@@ -305,6 +305,16 @@ setDate = setMeta "date"

-- Inline list builders

-- | Extend a list with a default element up to a maximum length
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we aren't using doctest in this project... so I'd just add a few more tests to test-pandoc-types.hs

@danse
Copy link
Contributor Author

danse commented Nov 24, 2017

okay i added a commit with the changes you mentioned, ready to be squashed

@jgm
Copy link
Owner

jgm commented Nov 28, 2017

If I understand correctly, this pads headers to max row length, but doesn't pad rows. Should rows be padded too if they have fewer cells than other rows or the header?

@danse
Copy link
Contributor Author

danse commented Nov 28, 2017

yeah padding rows is a low-hanging fruit here, i'll add a test and the feature

@danse
Copy link
Contributor Author

danse commented Nov 29, 2017

i added row padding. some tests are failing but the error seems independent from the changes i made

@jgm
Copy link
Owner

jgm commented Dec 2, 2017

If I'm not mistaken, nothing prevents having a different number of columns in the body than in the cellspec (column widths and alignments). Is this something we should also sanitize?

One option would be to take the cellspec as determining the number of columns, and pad or truncate all rows to that number.

@danse
Copy link
Contributor Author

danse commented Dec 3, 2017

i'd say to pad also the specification. about the idea to use the specification as an indicator of the max size, i'd rather keep the current max length. i think that it's more sensible because:

  • the specification is metadata and the user does not see it on some source formats
  • picking the max size, no data is lost

@jgm
Copy link
Owner

jgm commented Dec 3, 2017 via email

@danse
Copy link
Contributor Author

danse commented Dec 4, 2017

if this is getting tricky, i'm not sure that we should keep discussing in this context. the original motivation of this pull request is to address issue jgm/pandoc#4059 which caused syntax errors to an user. from the point of view of that user, this pull request is already solving a problem. we also thought to extend this to rows, but now it seems like our desire to improve more is preventing us to improve the experience of that user

@jgm
Copy link
Owner

jgm commented Dec 4, 2017 via email

@danse
Copy link
Contributor Author

danse commented Dec 5, 2017

oh, i see, you were making a point about your proposal while i thought that we were stuck. let me reconsider more carefully

@danse danse force-pushed the jgm/pandoc#4059 branch 2 times, most recently from 442d858 to ed19be7 Compare January 15, 2018 15:52
@danse
Copy link
Contributor Author

danse commented Jan 15, 2018

i updated the pull request, now it pads or truncate depending on colspecs. i hope i got the Haddock right :)

@danse
Copy link
Contributor Author

danse commented Jan 25, 2018

@danse
Copy link
Contributor Author

danse commented Feb 12, 2018

i tried to run pandoc tests using this version of pandoc-types and 49 tests fail. many if not all failures could be due to the change of behaviour, but there could also be some error. i will propose this again when i will have a pull request ready for pandoc tests as well

@danse danse closed this Feb 12, 2018
danse added a commit to italia/pandoc that referenced this pull request Feb 12, 2018
@danse
Copy link
Contributor Author

danse commented Feb 12, 2018

alright it was not that bad, three tests were actually failing and here is the fix.

  • in Writers.Muse, empty specs caused truncation of the whole table content
  • in 2649, what seems to be a parsing or building error is propagated to all rows
  • in 3337, a table is sliced according to the specs the parser is probably passing to the builder

we have a problem similar to 3337 with the DOCX parser, i think that i will change the logic there so that the maximum width is used for the specs

@danse
Copy link
Contributor Author

danse commented Mar 19, 2018

i saw that there were some updates on master so i rebased this. this is ready as far as i can tell

@jgm jgm merged commit 5fead2b into jgm:master Mar 30, 2018
@jgm
Copy link
Owner

jgm commented Mar 30, 2018

Thanks!

@danse danse deleted the jgm/pandoc#4059 branch April 3, 2018 13:10
@danse
Copy link
Contributor Author

danse commented Apr 3, 2018

Thanks to you! Let me know whether i can help updating the tests in pandoc. I remind you that some info about how to update them and a commit with the changes are in the comment above

@jgm
Copy link
Owner

jgm commented Apr 5, 2018

OK, I started looking at pandoc tests.

In command test 3337, we have a table with no header and two body rows, one with two cells, one with one. The test output expects the table to be padded to two cells, but with this commit it's padded to one cell.

I'm actually starting to change my mind about the decision above; maybe padding to the maximum width makes the most sense? After all, that is what HTML will do.

@jgm
Copy link
Owner

jgm commented Apr 5, 2018

OK, looking back at my reasons above, they seem sound.
Maybe this can be handled with a change to the HTML reader.

@jgm
Copy link
Owner

jgm commented Apr 5, 2018

I'm going to make the needed changes in pandoc tests and pandoc.

jgm added a commit to jgm/pandoc that referenced this pull request Apr 5, 2018
In jgm/pandoc-types#36 we changed
the table builder to pad cells.  This commit changes tests
(and two readers) to accord with this behavior.
@danse
Copy link
Contributor Author

danse commented Apr 9, 2018

sorry for being late on this. In the case of 3337, as you said, the HTML reader is responsible for calling the builder with specs that will not lead to any data loss. Thanks for moving this forward, i'll verify that it closes #4059

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants