mx_api_content() fails if the last page doesn't contain any records #19

Closed
mcguinlu opened this issue Jan 4, 2021 · 2 comments

mcguinlu commented Jan 4, 2021

This seems to happen because the "total" metadata field reports more records than are actually available.

As of 14:39 on 4 January 2021, "total" is 148231. However, if you set the counter to any record within 31 of this figure (e.g. https://api.biorxiv.org/details/biorxiv/2013-01-01/2021-01-04/148201), the API returns a "No posts found" message. Because medrxivr uses the "total" metadata field to work out how many pages it needs to cycle through to download the whole database, this sometimes causes an error when the last page that medrxivr expects, based on "total", is empty.

Note that as more records are added to the API, the hard-coded figures above will no longer demonstrate the issue.
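
To make the arithmetic concrete, here is a rough sketch of how the figures above lead to an empty last page. It assumes the page count is derived as floor(total / 100); that derivation is not shown in this issue, so treat it as an assumption rather than the package's actual code.

```r
# Figures taken from the comment above; the page-count formula is an assumption.
total <- 148231                      # "total" reported by the API metadata
page_size <- 100                     # records per request (cursor * 100 in the loop)

pages <- floor(total / page_size)    # assumed derivation of the last cursor value
last_offset <- pages * page_size     # offset requested on the final iteration
shortfall <- total - last_offset     # how far the final offset sits below "total"

# Offset as it would be pasted into the API URL (no scientific notation)
format(last_offset, scientific = FALSE)
```

Under that assumption the final request starts at offset 148200, which is 31 below "total" and so falls in the range that currently returns "No posts found".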

mcguinlu commented Jan 5, 2021

I think the best way to address this is to prevent the api_to_df() helper from failing when the cursor counter in the for loop equals the maximum pages value:

medrxivr/R/mx_api.R

Lines 80 to 96 in 83e686b

for (cursor in 0:pages) {
  page <- cursor * 100
  page_link <- api_link(server,
                        from_date,
                        to_date,
                        format(page,
                               scientific = FALSE))
  tmp <- api_to_df(page_link)
  tmp <- tmp$collection
  df <- rbind(df, tmp)
  pb$tick(100)
}
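
One possible shape for that change is sketched below. This is not the fix as implemented in medrxivr: fake_api_to_df() and fetch_all() are hypothetical names, the stub simply errors once the offset passes the available records, and the real api_to_df() may signal an empty page differently.

```r
# Hypothetical stand-in for api_to_df(): returns a list with a `collection`
# data frame, and errors when the offset is beyond the available records.
fake_api_to_df <- function(offset, available = 250) {
  if (offset >= available) stop("No posts found")
  n <- min(100, available - offset)
  list(collection = data.frame(id = offset + seq_len(n)))
}

# Hypothetical loop mirroring the one above, but tolerating a failure
# on the final expected page only.
fetch_all <- function(pages, fetch = fake_api_to_df) {
  df <- NULL
  for (cursor in 0:pages) {
    offset <- cursor * 100
    tmp <- tryCatch(
      fetch(offset)$collection,
      error = function(e) {
        # An empty last page is expected when "total" overstates the count;
        # any earlier failure is re-raised unchanged.
        if (cursor == pages) NULL else stop(e)
      }
    )
    df <- rbind(df, tmp)  # rbind() ignores a NULL argument
  }
  df
}

res <- fetch_all(pages = 3)  # the fourth request (offset 300) is empty
nrow(res)
```

Restricting the tolerance to cursor == pages keeps genuine failures on earlier pages visible instead of silently returning a truncated data frame.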

@mcguinlu mcguinlu self-assigned this Jan 5, 2021
@mcguinlu mcguinlu added the bug Something isn't working label Jan 5, 2021

mcguinlu commented Feb 1, 2021

Update: I've written to the CSHL team to flag this issue and they are looking into it. In the meantime, the solution proposed above is probably the best bet.

@mcguinlu mcguinlu pinned this issue Feb 16, 2021
@maelle maelle unpinned this issue Oct 1, 2024