pdf graphs to csv #293

Open
Derek-Jones opened this Issue May 2, 2018 · 5 comments


Derek-Jones commented May 2, 2018

[ Project Contact ] @Derek-Jones
[ GitHub Repo ] https://github.com/Derek-Jones/pdf-2-csv
[ Track ] Openness
[ Location ] London, England
[ Coach ] @SamanthaHindle & @aariops

Description

Data within pdf documents is sometimes displayed in graphs, e.g., many points with x/y coordinates in a plot. When a plot is drawn using pdf operations, it is possible to extract the underlying values (a proof of concept is available). This project aims to add an option to Mozilla's PDF.js renderer that extracts the data contained in a plot when the plot is clicked.
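
To give a rough idea of the extraction step, here is a minimal sketch in Python. It is not the proof-of-concept code and it does not use PDF.js; it assumes the page coordinates of the plotted points have already been recovered from the pdf drawing operations, that both axes are linear, and that two tick marks with known values can be located on each axis. The function names and all the numbers are hypothetical.

```python
import csv
import io


def make_axis_map(page_a, value_a, page_b, value_b):
    """Build a function mapping a page coordinate to a data value.

    Assumes a linear axis, calibrated from two reference ticks:
    page position page_a carries value_a, page_b carries value_b.
    """
    if page_a == page_b:
        raise ValueError("reference ticks must be at different page positions")

    def to_value(page):
        # Linear interpolation (or extrapolation) between the two ticks.
        return value_a + (page - page_a) * (value_b - value_a) / (page_b - page_a)

    return to_value


def points_to_csv(points, x_map, y_map):
    """Convert (page_x, page_y) pairs to data values and return CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["x", "y"])
    for page_x, page_y in points:
        writer.writerow([x_map(page_x), y_map(page_y)])
    return out.getvalue()


# Hypothetical calibration: x ticks at page positions 100 and 300 are
# labelled 0 and 10; y ticks at page positions 50 and 250 are labelled
# 0 and 100.
x_map = make_axis_map(100.0, 0.0, 300.0, 10.0)
y_map = make_axis_map(50.0, 0.0, 250.0, 100.0)

# Hypothetical point positions, in the same page units as the ticks.
page_points = [(100.0, 50.0), (200.0, 150.0), (300.0, 250.0)]

csv_text = points_to_csv(page_points, x_map, y_map)
print(csv_text)
```

A log-scaled axis would need a different mapping, and finding the points and tick marks in the first place is the hard part that this sketch leaves out.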


Want to contribute to this project during #mozsprint?

Join us at the Global Sprint, May 10-11. Leave a comment below if you're interested in contributing to this project during #mozsprint 2018!

Collaborator

aariops commented May 3, 2018

Hi @Derek-Jones

Welcome to Mozilla's Global Sprint. 🎉

Let me and @SamanthaHindle know if you need help with anything. 😄

SamanthaHindle commented May 7, 2018

Hi @Derek-Jones

I really like your project! It looks like it will be a really useful tool to promote the reuse of data. I imagine a lot of researchers would be happy to help you with your project and so it might be worth generating a few "Good First Issues" to help invite contributors of all abilities. What do you think?

Let me and @aariops know if you would like some help!

Are you going to be at an official Sprint site in London, UK, for the Global Sprint? Hopefully we'll get to meet you virtually! 😃

Derek-Jones commented May 7, 2018

Hi @SamanthaHindle and @aariops,

I'm pleased you think there will be lots of interest. Figuring out what's going on inside a pdf file can be rather complicated. This is a somewhat techie project.

Somebody who has previously worked with PDF.js would be a great person to have on the team.

I plan to be at the Mozilla offices in London (there are sprints at two sites in London; any idea which will contain the most people?)

I am looking at getting on the featured list and have created a project graphic (see project repo; no, I am not a graphics person).

dwhly commented May 7, 2018

@Derek-Jones Just a note here that I know @petermr and @ContentMine did work in this area. Not sure if they have libraries that might be useful.

Derek-Jones commented May 7, 2018

@dwhly, thanks for the pointers. @ContentMine extracts data points from svg files; Inkscape can be used to manually highlight an image in a pdf and save it as an svg file (ContentMine is linked on the pdf2csv README).

acabunoc added this to the global sprint 2018 milestone May 8, 2018

Derek-Jones referenced this issue May 10, 2018 in: Qualitative Data Analysis with Zotero #324 (Open, 2 of 6 tasks complete)