Working implementation of journal meta data using openalex. Passes te…#168
…anagement for journal entries. Todo: fix geometries, add tests.
…sts, needs review.
@nuest I think the changes I made to add the journal functionality might be causing issues with harvesting in tasks.py. Would it be okay to change some of the previous code a little to fix the conflicts and get everything running smoothly?
Yes, of course. Please fix the tests. I can nevertheless do a first review tomorrow, even with the current changes.
nuest left a comment:
Thanks! This is a major update, so let's try to get things as right as possible:
Let's change the name from "journals" to "sources", in line with the entity names in OpenAlex, see https://docs.openalex.org/api-entities/entities-overview
We certainly also want to geolocate preprint articles, and then "journal = preprint server" makes no sense.
"publications" is also not a good name? Right, we'll get to that in #169.
fixtures/test_data.json
"model": "publications.journal",
"pk": 1,
"fields": {
    "name": "Nature",
I understand where you're coming from, but this is actually not what our values are.
If we want to use real journals here because their ISSN actually exists, then we go with diamond open access journals (https://en.wikipedia.org/wiki/Diamond_open_access).
https://github.com/loreabad6/doaj-geo is a good starting point, so let's use some that we might also want to collaborate with. Please add the following journals to the test data:
…tional fields for sources. Testing ongoing.
These are the errors I am having the most difficulty resolving:
publications/tasks.py
if src and getattr(src, "is_preprint", False) and geom.empty:
    try:
        loc = Nominatim(user_agent="optimap-tasks").geocode(
Where does the geocoding come from? Can you please clarify which issue this implements?
publications/tasks.py
        )
        ps_list, pe_list = extract_timeperiod_from_html(soup)
    except Exception:
This is my local test output:
The remaining failed test is obsoleted by 2fa0215.
Closes #5
Data Model Changes
Added a new Journal model with the fields name, issn_l, openalex_id, openalex_url, publisher_name, works_count, and works_api_url. Converted Publication.source from a text field to a ForeignKey pointing at this Journal model.
Serializers and ViewSets
Created a JournalSerializer exposing all eight journal fields, then updated PublicationSerializer (a GeoFeatureModelSerializer) to include a nested field source_details = JournalSerializer(source="source"). The PublicationViewSet filters out any publications that lack a valid source or have a null geometry, so that GeoJSON serialization (via DRF-GIS) never fails.
OpenAlex Sync Command
Wrote a management command update_openalex_journals that, for each Journal with a non-null ISSN-L, fetches metadata from OpenAlex (using requests.get("https://api.openalex.org/sources/issn:")). It populates or updates each journal’s openalex_id, openalex_url, publisher_name, works_count, and works_api_url, saving only when changes occur.
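The "save only when changes occur" step could look roughly like the following sketch. It is not the code from this PR: JournalRecord is a plain stand-in for the Django Journal model, and source_lookup_url, fields_from_openalex, and apply_openalex_metadata are hypothetical helper names. The mapping from the OpenAlex source object (id, host_organization_name, works_count, works_api_url) to the journal fields is an assumption, in particular that openalex_url holds the full OpenAlex id URL and openalex_id its last path segment.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class JournalRecord:
    """Plain stand-in for the Django Journal model (hypothetical, for illustration)."""
    name: str
    issn_l: Optional[str] = None
    openalex_id: Optional[str] = None
    openalex_url: Optional[str] = None
    publisher_name: Optional[str] = None
    works_count: Optional[int] = None
    works_api_url: Optional[str] = None


def source_lookup_url(issn_l: str) -> str:
    """Build the OpenAlex lookup URL for one ISSN-L."""
    return f"https://api.openalex.org/sources/issn:{issn_l}"


def fields_from_openalex(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Map an OpenAlex source object to journal fields (assumed mapping)."""
    url = payload.get("id")
    return {
        # Assumption: the short ID is the last path segment of the full URL.
        "openalex_id": url.rsplit("/", 1)[-1] if url else None,
        "openalex_url": url,
        "publisher_name": payload.get("host_organization_name"),
        "works_count": payload.get("works_count"),
        "works_api_url": payload.get("works_api_url"),
    }


def apply_openalex_metadata(journal: JournalRecord, payload: Dict[str, Any]) -> List[str]:
    """Update the journal in place and return the names of the fields that changed."""
    changed = []
    for field_name, value in fields_from_openalex(payload).items():
        if getattr(journal, field_name) != value:
            setattr(journal, field_name, value)
            changed.append(field_name)
    return changed
```

In a management command, the returned list could decide whether to call save() at all, for example journal.save(update_fields=changed) only when the list is non-empty. The payload would come from requests.get(source_lookup_url(journal.issn_l)).json().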
Test Coverage for Journals + Publications
TODO: