Parser #8

Open
venkatr123 opened this issue Jul 17, 2018 · 7 comments

Comments

@venkatr123

For different resume formats such as .doc and .pdf, I am getting the errors below:

{ Error: Error for type: [[ application/pdf ]], file: [[ d:\code4goal-resume-parser-master/public/2016_JRM_Resume.pdf ]], extractor for type exists, but failed to initialize. Message: INFO: 'pdftotext' does not appear to be installed, so textract will be unable to extract PDFs.
at extract (d:\code4goal-resume-parser-master\node_modules\textract\lib\extract.js:147:15)
at Timeout._onTimeout (d:\code4goal-resume-parser-master\node_modules\textract\lib\extract.js:155:7)
at ontimeout (timers.js:466:11)
at tryOnTimeout (timers.js:304:5)
at Timer.listOnTimeout (timers.js:267:5) typeNotFound: true }
Error: antiword read of file named [[ Abhilash_Reddy - Copy.doc ]] failed: Error: Command failed: antiword -m UTF-8.txt "d:\code4goal-resume-parser-master/public/Abhilash_Reddy - Copy.doc"
'antiword' is not recognized as an internal or external command,
operable program or batch file.

Is there an extractor that works for all formats?
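
Both errors above say that an external program (`pdftotext` for the PDF, `antiword` for the .doc) could not be found. A minimal sketch for checking this before running the parser, assuming Node.js: the helper name `isOnPath` and the `-v` argument passed to `pdftotext` are illustrative choices, not part of textract or this project.

```javascript
const { spawnSync } = require('child_process');

// Hypothetical helper: returns true if `command` can be launched
// from the PATH of the current process.
function isOnPath(command, args = []) {
  const result = spawnSync(command, args, { stdio: 'ignore' });
  // `result.error` is set (for example with code ENOENT) when the
  // executable cannot be found; the exit status is deliberately ignored.
  return !result.error;
}

// The two tools named in the error messages above.
const tools = [
  { name: 'pdftotext', args: ['-v'] },
  { name: 'antiword', args: [] },
];

for (const tool of tools) {
  const status = isOnPath(tool.name, tool.args) ? 'found' : 'NOT found';
  console.log(tool.name + ': ' + status);
}
```

If a tool is reported as not found, installing it and making it reachable from the PATH is the first thing to try.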

@likerRr
Owner

likerRr commented Jul 19, 2018

Not sure. Moreover, some of them may be out of date, because code4goal-resume-parser itself hasn't received any updates for a long time. Sorry.

@nrsharma11

I have resolved the "pdftotext" issue, but I am still getting the "antiword read of file named" error.
Can you please suggest a fix or help in this regard?
Thanks in advance.

@likerRr
Owner

likerRr commented Aug 8, 2019

Great to hear! Can you send a PR with a fix?
Do you have any issues with other doc files? Or only with that one?

@nrsharma11

To resolve the "pdftotext" error, I downloaded the xpdf tools from here and copied pdftotext.exe into the Windows folder.

Yes, I am facing the issue with all .doc files; it keeps saying "antiword read of file named". Interestingly, when I save the same file with the ".docx" extension, it is processed and I get results. So the problem may be specific to .doc files.
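
The behaviour described above is consistent with the error messages in this thread: the .pdf and .doc paths each named an external program, while .docx worked without one. A small sketch, assuming Node.js, that flags which files would need which tool: the mapping is taken only from the errors reported here and is not an exhaustive list, and `externalToolFor` is a hypothetical helper name.

```javascript
const path = require('path');

// Mapping taken from the error messages in this thread only;
// other formats may need other tools.
const EXTERNAL_TOOL_BY_EXTENSION = {
  '.pdf': 'pdftotext',
  '.doc': 'antiword',
};

// Hypothetical helper: returns the name of the external tool a file
// would need, or null if none is known from this thread.
function externalToolFor(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  return EXTERNAL_TOOL_BY_EXTENSION[ext] || null;
}

console.log(externalToolFor('Abhilash_Reddy - Copy.doc'));
console.log(externalToolFor('Abhilash_Reddy - Copy.docx'));
```

A check like this could be used to warn early about files that will fail until `antiword` is installed, or to decide which files to convert to .docx first.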

I am also interested in getting LinkedIn profile data from a public profile URL. I tried, but I am not getting results; the JSON contains only empty nodes. Please have a look below:
"linkedin": { "positions": { "past": [ ], "current": { "title": "", "company": "", "description": "", "period": "" } }, "languages": [ ], "skills": [ ], "educations": [ ], "volunteering": [ ], "volunteeringOpportunities": [ ] }

@likerRr
Owner

likerRr commented Aug 8, 2019

Since the parser was written, LinkedIn's HTML or API may have changed. Sorry, I don't maintain this project at the moment and can't take a look. There is a fork of my project with lots of issues fixed. Can you try it and see whether your issues are fixed?

@nrsharma11

Okay. Thanks for your reply.

@fadiajabeen

I placed pdftotext.exe in the root folder, i.e. the folder from which app.js is run, and now it's working.
