LaTeX reader fails to read LaTeX table with astropy.units that was written with LaTeX writer #5205

Open
wmwv opened this issue Jul 28, 2016 · 11 comments
@wmwv
Contributor

wmwv commented Jul 28, 2016

LaTeX reader fails to read LaTeX table with astropy.units that was written with LaTeX writer.

Round-tripping astropy.units should be added to the astropy table LaTeX writing/reading test suite.

@wmwv
Contributor Author

wmwv commented Jul 28, 2016

This is with AstroPy 1.2.1

Example test case:

from astropy.table import Table
import astropy.units as u

test_table = Table([[0, 1], [10, 20]], names=('x', 'y'))

# Round trip without units: the read works.
test_filename = 'test_table.tex'
test_table.write(test_filename, format='ascii.aastex')
this_read_works = Table.read(test_filename, format='ascii.aastex')

# Round trip after attaching a unit to column 'x': the read fails.
test_table['x'].unit = u.day
test_units_filename = 'test_units_table.tex'
test_table.write(test_units_filename, format='ascii.aastex')
this_read_fails = Table.read(test_units_filename, format='ascii.aastex')

@hamogu
Member

hamogu commented Aug 10, 2016

I will fully admit that it would be nice to round-trip everything. However, when I wrote the LaTeX Reader/Writer classes, I consciously decided not to worry too much about round-tripping. Nobody should be using LaTeX as a data storage format. The use cases I had in mind are these:

  • Have a table that needs to be included in a publication. Most people will probably copy & paste into their LaTeX document and modify by hand. So, the LaTeX writer needs to preserve as much info as possible (e.g. units).
  • Find some giant data table in a pdf in an article that has no electronic table. Download the LaTeX file from astro-ph and parse that file. In my experience these giant tables tend to be in the appendix of papers and tend to be formatted fairly simply. I anticipated that in most cases people would do that for one or two tables at most. If the reading does not work (e.g. units in the header or page breaks or other mark-up in the table) just edit in emacs/vi/... before reading.
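
The hand edit described in the second bullet can also be scripted. The following is only a minimal sketch, assuming the simple layout shown in `example` (column names on the first row inside the tabular environment, a single units row directly below it). `drop_units_row` is a hypothetical helper for illustration and is not part of astropy.

```python
def drop_units_row(latex_text):
    # Remove the second row inside the tabular environment, which is
    # ASSUMED to be the units row. Everything else is passed through.
    out = []
    rows_seen = None  # None while we are outside the tabular environment
    for line in latex_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(r"\begin{tabular}"):
            rows_seen = 0
            out.append(line)
            continue
        if stripped.startswith(r"\end{tabular}"):
            rows_seen = None
        if rows_seen is not None:
            rows_seen += 1
            if rows_seen == 2:
                continue  # skip the assumed units row
        out.append(line)
    return "\n".join(out)


# Assumed layout of a table written with a unit on column x.
example = r"""\begin{table}
\begin{tabular}{cc}
x & y \\
$\mathrm{d}$ &  \\
0 & 10 \\
1 & 20 \\
\end{tabular}
\end{table}"""

cleaned = drop_units_row(example)
print(cleaned)
```

The cleaned text has the same shape as a table written without units, which is the case reported as readable earlier in this thread. The units themselves are lost and would have to be re-attached by hand.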

This is not to say that I won't be thrilled if somebody implements a complete LaTeX table parser, I'm just saying that round-tripping LaTeX tables does not seem very important to me because LaTeX tables tend to be one-way (write and send to publisher) and are generally not used for saving your results to disk and reading them back in tomorrow.

Out of curiosity: Do you actually need to round-trip LaTeX tables or did you stumble on this by accident and just wondered why it did not work?

@wmwv
Contributor Author

wmwv commented Aug 10, 2016

While my CS spirit agrees that LaTeX tables are a non-ideal serialization format, my astronomer spirit has a specific use case for using LaTeX tables:

The results of a paper should be reproducible.

  1. The ideal of "reproducible" includes things such as
    • run make and regenerate all of the tables, plots, and numbers from some reasonable intermediate data product (e.g., photometric catalogs, or even 2D detrended images).
      I'm running into all of this because I'm trying to make sure that all plots in the data paper I'm writing right now are directly reproducible from the tables being published. In the past I've tried to implement the above by separately generating the LaTeX tables, the plots, and the summary numbers. Every time I've had a manual step (e.g., copy and paste into a large LaTeX document, or hand-edit some line), I've been bitten by some small disagreement as things got out of sync.
    • Someone else should be able to download the tables as published in a paper and redo the plots and the calculations relatively straightforwardly. The higher the barrier to doing this, in time and required knowledge, the less effectively reproducible the research is. If the LaTeX files in my paper can't be read by AstroPy, then someone else downloading the tables has to do something to them in order to read them. That makes things more prone to error. It also takes something that you can give a summer undergrad to do in an afternoon and makes it take a week.

@wmwv
Contributor Author

wmwv commented Aug 10, 2016

The AstroPy.Tables documentation currently claims that the LaTeX reader/writer has the property that "it can read the tables that it writes."

http://docs.astropy.org/en/stable/api/astropy.io.ascii.Latex.html#astropy.io.ascii.Latex

@hamogu
Member

hamogu commented Aug 10, 2016

I use a script to generate tables and plots from the same source for exactly that reason. I keep the data as e.g. fits and then generate both the LaTeX tables and the figures from the same tex file to keep it in sync.
I also strongly advocate submitting relevant tables to the journals to be published as "electronic table" and not relying on people getting the LaTeX source from astro-ph to parse the data. Still, I see where you are coming from and I hope that somebody will implement these features eventually.

@wmwv
Contributor Author

wmwv commented Aug 11, 2016

I'm happy to help and do some real work to address this issue as well as #5160

I've forked and started a new branch at https://github.com/wmwv/astropy/tree/test-read-write to add a read-write test for all things currently tested in the astropy/io/ascii/tests/test_write.py

@hamogu
Member

hamogu commented Aug 11, 2016

That is great. When you think it's ready, just open a pull request for it and let me know so I can look it over.
Thank you so much for your help!


@mhvk mhvk removed the units label Aug 11, 2016
@mhvk
Contributor

mhvk commented Aug 11, 2016

(Removed the units label, since this is not something the units module can help with)

@wmwv
Contributor Author

wmwv commented Aug 11, 2016

@hamogu I have a partial solution for reading tables with units. It does not actually read the units, but it will read in the colnames, which is at least a partial solution. #5237

@pllim
Member

pllim commented Feb 10, 2020

I just ran into this issue with Astropy 4.0. The behavior was surprising to me. I had to do this as a workaround for my data:

result_tex = Table.read('my_result.tex', format='ascii.latex', data_start=4)
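
A possible reading of where the 4 comes from, sketched under two assumptions that are not confirmed in this thread: that the file has the layout shown in `example` (column names on the first row inside the tabular environment, a single units row right after it), and that `data_start` is a zero-based line index into the file. `first_data_index` is a hypothetical helper, not an astropy function.

```python
# Assumed layout of a LaTeX table written with one unit attached.
example = r"""\begin{table}
\begin{tabular}{cc}
x & y \\
$\mathrm{d}$ &  \\
0 & 10 \\
1 & 20 \\
\end{tabular}
\end{table}"""


def first_data_index(lines, header_rows=2):
    """Zero-based index of the first data row (hypothetical helper).

    header_rows counts the rows to skip inside the tabular environment:
    here the column names plus the units row.
    """
    for index, line in enumerate(lines):
        if line.strip().startswith(r"\begin{tabular}"):
            return index + 1 + header_rows
    raise ValueError("no tabular environment found")


lines = example.splitlines()
start = first_data_index(lines)
print(start, lines[start])
```

For a file with extra lines before the tabular environment (a caption, for instance) the index would shift, so the value needs checking per file.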

@keflavich
Contributor

I'd like to bump this issue.

The documentation states:
"This class can also read simple LaTeX tables (one line per table row, no \multicolumn or similar constructs), specifically, it can read the tables that it writes."

The table reader can't read back exactly what it printed if that includes units. We should modify the documentation to say something like, "The reader can read back LaTeX tables it creates, but it will not properly handle units, and additional commands may need to be added to the ignore_latex_commands keyword."
