pdf graphs to csv #293
Comments
acabunoc added the [Track] Openness label May 2, 2018
aariops
May 3, 2018
Collaborator
Hi @Derek-Jones
Welcome to Mozilla's Global Sprint.
Let me and @SamanthaHindle know if you need help with anything.
SamanthaHindle
May 7, 2018
Hi @Derek-Jones
I really like your project! It looks like it will be a really useful tool to promote the reuse of data. I imagine a lot of researchers would be happy to help you with your project and so it might be worth generating a few "Good First Issues" to help invite contributors of all abilities. What do you think?
Let me and @aariops know if you would like some help!
Are you going to be at an official Sprint site in London, UK, for the Global Sprint? Hopefully we'll get to meet you virtually!
Derek-Jones
May 7, 2018
Hi @SamanthaHindle and @aariops,
I'm pleased you think there will be lots of interest. Figuring out what's going on inside a pdf file can be rather complicated. This is a somewhat techie project.
Somebody who has previously worked with PDF.js would be a great person to have on the team.
I plan to be at the Mozilla offices in London (there are sprints at two sites in London; any idea which will contain the most people?)
I am looking at getting on the featured list and have created a project graphic (see project repo; no, I am not a graphics person).
dwhly
May 7, 2018
@Derek-Jones Just a note here that I know @petermr and @ContentMine did work in this area. Not sure if they have libraries that might be useful.
Derek-Jones
May 7, 2018
@dwhly, thanks for the pointers. @ContentMine extracts data points from svg files; inkscape can be used to manually highlight an image in a pdf and save it as an svg file (ContentMine is linked on the pdf2csv README).
Derek-Jones commented May 2, 2018 (edited by acabunoc)
[ Project Contact ] @Derek-Jones
[ GitHub Repo ] https://github.com/Derek-Jones/pdf-2-csv
[ Track ] Openness
[ Location ] London, England
[ Coach ] @SamanthaHindle & @aariops
Description
Data within pdf documents is sometimes displayed in graphs, e.g., many points with x/y coordinates in a plot. When plots are created using pdf operations, it is possible to extract the underlying values (a proof of concept is available). This project aims to add an option to Mozilla's PDF.js renderer to extract the data contained in the clicked plot.
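To illustrate the general idea (this is not the project's proof of concept and does not use PDF.js), here is a minimal Python sketch. It assumes a hypothetical, already-decompressed content stream in which each data point is drawn as a small rectangle with the PDF `re` operator, and it assumes two axis ticks per axis are known so that page coordinates can be mapped linearly to data values. Real files are usually messier: streams are compressed, coordinates pass through transformation matrices, and markers may be drawn with other path operators. The sample stream, tick positions, and all function names below are made up for illustration.

```python
import csv
import io
import re

# Hypothetical content-stream fragment (an assumption, not taken from a real
# file): each marker is a rectangle drawn as "x y width height re".
SAMPLE_STREAM = """
100 200 4 4 re f
150 250 4 4 re f
200 300 4 4 re f
"""

NUM = r"(-?\d+(?:\.\d+)?)"
RECT_OP = re.compile(NUM + r"\s+" + NUM + r"\s+" + NUM + r"\s+" + NUM + r"\s+re\b")


def marker_centres(stream):
    """Return the centre, in page coordinates, of every rectangle drawn."""
    centres = []
    for x, y, w, h in RECT_OP.findall(stream):
        x, y, w, h = map(float, (x, y, w, h))
        centres.append((x + w / 2, y + h / 2))
    return centres


def linear_map(page_a, data_a, page_b, data_b):
    """Map a page coordinate to a data value using two known axis ticks."""
    scale = (data_b - data_a) / (page_b - page_a)
    return lambda p: data_a + (p - page_a) * scale


def points_to_csv(points, x_map, y_map):
    """Convert page-coordinate points to data values and return CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["x", "y"])
    for px, py in points:
        writer.writerow([round(x_map(px), 6), round(y_map(py), 6)])
    return out.getvalue()


# Assumed calibration: x ticks at page 102 -> 0 and page 202 -> 10,
# y ticks at page 202 -> 0 and page 302 -> 100.
x_map = linear_map(102.0, 0.0, 202.0, 10.0)
y_map = linear_map(202.0, 0.0, 302.0, 100.0)
print(points_to_csv(marker_centres(SAMPLE_STREAM), x_map, y_map))
```

The linear mapping only suits linear axes; a log-scaled axis would need the mapping applied to the logarithm of the tick values instead.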
Want to contribute to this project during #mozsprint?
Join us at the Global Sprint, May 10-11. Leave a comment below if you're interested in contributing to this project during #mozsprint 2018!