Feature/DBLP abstract extraction process change #1993

xkopenreview · 2024-01-24T18:57:46Z

This PR updates two process functions:

DBLP.org/-/Record: add the abstract when a paper is imported (if the html link is available and the abstract can be extracted)
DBLP.org/-/Abstract: allow manual triggering of abstract extraction

melisabok · 2024-01-24T19:56:55Z

openreview/profile/process/dblp_abstract_process.js

@@ -0,0 +1,28 @@
+async function process(client, edit, invitation) {


Why should the abstract invitation have a process function to extract the abstract?

This invitation is used to set the abstract value.

To allow manual triggering of abstract extraction when the process function in DBLP.org/-/Record fails.

But would the abstract in the edit be empty, or what?

If the edit has a value, do you overwrite it?

And there seems to be an infinite loop here: you are posting an edit with the same invitation as the process function.
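
One way to address both concerns raised here (overwriting a value that is already set, and re-triggering the same process function) would be a guard at the top of the process function. This is only a sketch: the helper name `shouldExtractAbstract` is hypothetical, and the edit shape `edit.note.content.abstract.value` is assumed by analogy with the `note.content.html?.value` access in this PR. It is not the code that was merged.

```javascript
// Hypothetical helper, not part of this PR.
// Returns true only when extraction is worth attempting:
//   - the triggering edit does not already carry an abstract, and
//   - the note has an html link to extract from.
function shouldExtractAbstract(edit, note) {
  // Assumed edit shape, mirroring note.content.html?.value in the diff.
  const editAbstract = edit?.note?.content?.abstract?.value;
  const html = note?.content?.html?.value;

  // An edit that already sets an abstract (including the one posted by the
  // process function itself) is left alone. This avoids overwriting a manual
  // value and stops the function from re-triggering itself.
  if (typeof editAbstract === 'string' && editAbstract.trim() !== '') {
    return false;
  }

  return typeof html === 'string' && html.trim() !== '';
}
```

With such a guard, the edit posted by the process function contains an abstract, so the second invocation returns early instead of posting again.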

melisabok · 2024-01-24T19:57:36Z

openreview/profile/process/dblp_record_process.js

+
+  const html = note.content.html?.value;
+
+  if (html) {


this is great! should we do the same to extract the PDF link?

Is it necessary to extract the PDF link?
From what I understand, it's just the redirected URL of the html, not a link to a .pdf file.

I'm not sure, but Andrew wants to fill in the PDF value with a link to the PDF, not to a webpage.

xkopenreview · 2024-02-23T19:22:38Z

to test importing many dblp papers in v2, existing notes in v1 need to be removed (so that there's no match)
the notes index needs to be created again so that the note can't be found in elasticsearch and will be treated as a new note

melisabok · 2024-02-23T20:42:32Z

Can you try Andrew's profile? There are a lot of new publications that were not imported. Or we can pick another profile that doesn't exist in the dev site and create it.

xkopenreview · 2024-03-19T15:04:57Z

Discussed this again with @carlosmondra and decided:

the process function calls Tools.extractAbstract(html url of dblp paper)
Tools.extractAbstract is a wrapper around fetch that calls cloud function url + html url of dblp paper and returns a JSON object (containing abstract, pdf, error)
the cloud function url is set as an environment variable in the image of the cloud function, and Tools.extractAbstract reads that environment variable

This way there's no dependency between the meta-extraction package and the API, and neither fetch nor the cloud function url is exposed.
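
The plan above could look roughly like the sketch below. Several details are assumptions, since the thread does not specify them: the environment variable name `ABSTRACT_EXTRACTION_URL`, passing the paper url as a `url` query parameter, and the injectable `fetchImpl` argument (added here only so the cloud function can be replaced in tests). The real `Tools.extractAbstract` may differ.

```javascript
// Assumed environment variable name; the thread does not name it.
const CLOUD_FUNCTION_URL_ENV = 'ABSTRACT_EXTRACTION_URL';

// Sketch of a fetch wrapper in the spirit of Tools.extractAbstract.
// fetchImpl defaults to the global fetch (available in Node 18+).
async function extractAbstract(htmlUrl, fetchImpl = globalThis.fetch) {
  const baseUrl = process.env[CLOUD_FUNCTION_URL_ENV];
  if (!baseUrl) {
    throw new Error(`${CLOUD_FUNCTION_URL_ENV} is not set`);
  }

  // Assumption: the paper url is sent as a "url" query parameter.
  const requestUrl = `${baseUrl}?url=${encodeURIComponent(htmlUrl)}`;
  const response = await fetchImpl(requestUrl);
  if (!response.ok) {
    throw new Error(`abstract extraction failed with status ${response.status}`);
  }

  // Per the discussion, the JSON result contains abstract, pdf and error.
  // Fields missing from the response are normalized to null.
  const { abstract = null, pdf = null, error = null } = await response.json();
  return { abstract, pdf, error };
}
```

Because the caller only sees `extractAbstract(htmlUrl)`, neither fetch nor the cloud function url leaks into the process function.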

melisabok · 2024-03-19T16:35:40Z

For the tests to pass, we may need to mock the service or something similar.

Maybe if extractAbstract cannot reach the cloud function, we shouldn't throw an error?

xkopenreview · 2024-03-19T16:58:14Z

discussed with Carlos again about this
decided to add the cloud function url in openreview-js

xkopenreview · 2024-03-22T17:15:14Z

The test failures look random to me.
The following tests, which CircleCI reported as failing, pass locally:

test_icml_conference.py
test_venue_request_v2
test_venue_request_v2

@melisabok can you help rerun the tests?

melisabok · 2024-03-22T20:16:29Z

checking now.

melisabok · 2024-03-22T21:05:01Z

Some API 1 tests are failing because I think time.sleep(0.5) is not enough to wait for the button to be clickable. I increased it.

melisabok · 2024-03-25T14:44:12Z

A try/catch was missing in the abstract process function, tests should pass now.
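
The kind of try/catch described here can be sketched as a small wrapper that never lets an extraction failure propagate, so an unreachable cloud function does not fail the import. The helper name `safeExtract` and the choice to return `null` on failure are assumptions for illustration, not the code committed in this PR.

```javascript
// Hypothetical helper: run an extractor and swallow its failures.
// `extract` is any async function that resolves to { abstract, pdf, error }.
async function safeExtract(extract, htmlUrl) {
  try {
    const result = await extract(htmlUrl);
    if (result?.error) {
      // The service answered but reported a problem; log it and move on.
      console.log(`abstract extraction returned an error: ${result.error}`);
      return null;
    }
    return result?.abstract ?? null;
  } catch (error) {
    // Network failure, timeout, bad JSON, and so on: log instead of throwing.
    console.log(`abstract extraction failed: ${error.message}`);
    return null;
  }
}
```

A process function could then set the abstract only when `safeExtract` returns a non-null value, and tests can pass a stub extractor instead of calling the real service.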

xkopenreview added 5 commits January 24, 2024 13:38

update process function of dblp record and abstract

3def324

fix format

b08ff26

fix format

afa13f1

fix format

bdb3f08

fix format

34c393f

xkopenreview marked this pull request as draft January 24, 2024 18:57

melisabok reviewed Jan 24, 2024

xkopenreview and others added 3 commits January 25, 2024 15:43

update process function

9d109ca

Merge branch 'master' into feature/dblp-abstract-process

930f66a

Merge branch 'master' into feature/dblp-abstract-process

2a13aad

xkopenreview and others added 4 commits February 23, 2024 15:52

Refactor abstract extraction in dblp_abstract_process.js and dblp_rec…

3bbe037

…ord_process.js

Add maxReplies 1000 to DBLP.org/-/Record

caf92af

Add console logs and error handling for abstract extraction in dblp p…

be352da

…rocesses

Merge branch 'master' into feature/dblp-abstract-process

40f4bff

melisabok marked this pull request as ready for review March 8, 2024 21:44

use cloud function

9b7be9f

xkopenreview and others added 2 commits March 19, 2024 11:15

Merge branch 'master' into feature/dblp-abstract-process

cb2f9e0

update dblp process function based on discussion

c963080

xkopenreview and others added 2 commits March 21, 2024 18:21

parse json result

cf439c0

Merge branch 'master' into feature/dblp-abstract-process

e08123c

give more time to click the element

ec60d0e

add try and catch

a5c48fc

Merge branch 'master' into feature/dblp-abstract-process

f426f0b

xkopenreview mentioned this pull request Mar 27, 2024

Fix/ Use API 2 to import DBLP papers openreview/openreview-web#1716

Merged

xkopenreview and others added 2 commits March 27, 2024 14:21

Merge branch 'master' into feature/dblp-abstract-process

0ca6360

Merge branch 'master' into feature/dblp-abstract-process

e9f2071

melisabok approved these changes Mar 28, 2024

Merge branch 'master' into feature/dblp-abstract-process

ac360e5

melisabok merged commit fd22f09 into master Apr 1, 2024
1 check passed

melisabok deleted the feature/dblp-abstract-process branch April 1, 2024 19:01
