Feature/DBLP abstract extraction #29
I think this change should have been made in a different library. Something like
It looks like dl.acm.org auto-detects the UMass IP, so the PDF link can be retrieved when I run the test locally but not in GitHub Actions.
Why would you need to extract the PDF? We only need the link to the PDF.
Correct. I mean the PDF link, not the PDF file.
So the page that contains the link has a paywall?
Non-UMass network (@carlosmondra)
Some statistics on existing DBLP imports in v1:
For 90% coverage, the following domains should be handled by a domain-specific rule or the general rule
For 95% coverage, the following domains should be handled by a domain-specific rule or the general rule
Full list of domains and counts
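For reference, a coverage list like the ones above could be derived roughly as follows. This is only a sketch in Python, not the script that produced the statistics: the function name `domains_for_coverage` and the sample URLs are hypothetical, and it assumes each existing import carries a source URL whose host identifies the publisher domain.

```python
from collections import Counter
from urllib.parse import urlparse


def domains_for_coverage(urls, target):
    """Return the most frequent domains whose combined share reaches `target`."""
    counts = Counter(urlparse(u).netloc.lower() for u in urls)
    total = sum(counts.values())
    selected, covered = [], 0
    # Walk domains from most to least frequent until the target share is met.
    for domain, count in counts.most_common():
        if covered / total >= target:
            break
        selected.append((domain, count))
        covered += count
    return selected


# Hypothetical sample; the real input would be the URLs of existing DBLP imports.
sample_urls = (
    ["https://dl.acm.org/doi/x"] * 6
    + ["https://ieeexplore.ieee.org/document/y"] * 3
    + ["https://example.org/z"] * 1
)
print(domains_for_coverage(sample_urls, 0.90))
print(domains_for_coverage(sample_urls, 0.95))
```

Domains outside the selected set would fall through to the general rule.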
Thanks for the status, @xkopenreview! I would like to use the dev site to test the DBLP upload using the process functions. Could we do that?
This PR should add a static function extractAbstract to Tools that extracts the abstract and PDF link based on the HTML of the DBLP note. It will be invoked by the process functions of DBLP.org/-/Record and DBLP.org/-/Abstract.
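To make the domain-specific rule versus general rule idea concrete, here is a minimal Python sketch. It is not the PR's implementation: the `extract_abstract(url, html)` signature, the `DOMAIN_RULES` registry, and the `_MetaCollector` helper are all hypothetical, and the general rule assumes the page exposes `citation_abstract` / `citation_pdf_url` style `<meta>` tags, which many but not all publishers do.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class _MetaCollector(HTMLParser):
    """Collect <meta name=... content=...> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        key = attr.get("name") or attr.get("property")
        if key and attr.get("content"):
            # Keep the first occurrence of each key.
            self.meta.setdefault(key.lower(), attr["content"].strip())


def _general_rule(meta):
    """Fallback rule (assumed): read common citation/description meta tags."""
    abstract = (
        meta.get("citation_abstract")
        or meta.get("og:description")
        or meta.get("description")
    )
    return {"abstract": abstract, "pdf": meta.get("citation_pdf_url")}


# Hypothetical registry: domain -> function(meta) -> result dict.
DOMAIN_RULES = {}


class Tools:
    @staticmethod
    def extract_abstract(url, html):
        """Pick a domain-specific rule if one is registered, else the general rule."""
        parser = _MetaCollector()
        parser.feed(html)
        rule = DOMAIN_RULES.get(urlparse(url).netloc.lower(), _general_rule)
        return rule(parser.meta)


page = (
    '<html><head>'
    '<meta name="citation_abstract" content=" We study X. ">'
    '<meta name="citation_pdf_url" content="https://example.org/paper.pdf">'
    '</head></html>'
)
result = Tools.extract_abstract("https://example.org/paper", page)
print(result)
```

A site that needs special handling would register its own function in `DOMAIN_RULES`. Note that this only parses HTML it is given, so a page that hides the PDF link behind a paywall or IP check would still yield no `pdf` value.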