This repository has been archived by the owner on Jan 13, 2022. It is now read-only.
[API Integration - TEXT] Unglue.it #193
Labels
✨ goal: improvement
Improvement to an existing feature
providers
🙅 status: discontinued
Not suitable for work as repo is in maintenance
This is a provider of texts and is therefore blocked: the Catalog is not ready to ingest that content type at this time.
Provider API Endpoint / Documentation
https://unglue.it/api/help
Internal users only: CC has an API key for this service, please check CC's password manager.
Provider description
A provider of openly licensed ebooks, some of which are available from Project Gutenberg.
Licenses Provided
They indicate that the works on their site are CC licensed or have another open license. We'd need to restrict ingestion to CC licenses.
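As a rough illustration of restricting ingestion to CC licenses, a script could keep only records whose license URL points at creativecommons.org/licenses/. The sketch below is an assumption about how that check might look, not existing Catalog code; the function name `parse_cc_license` is hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches URLs such as https://creativecommons.org/licenses/by-nc-nd/3.0/
# (an optional jurisdiction suffix like /uk/ is tolerated).
_CC_LICENSE_RE = re.compile(
    r"^https?://(?:www\.)?creativecommons\.org/licenses/"
    r"(?P<license>[a-z\-]+)/(?P<version>\d+\.\d+)(?:/.*)?$",
    re.IGNORECASE,
)


def parse_cc_license(url: str) -> Optional[Tuple[str, str]]:
    """Return (license, version) for a CC license URL, or None otherwise.

    Hypothetical helper: records for which this returns None would be
    skipped during ingestion.
    """
    match = _CC_LICENSE_RE.match(url.strip())
    if match is None:
        return None
    return match.group("license").lower(), match.group("version")
```

Note that this sketch only accepts URLs under /licenses/, so public domain dedications under /publicdomain/ would be skipped; whether those should be ingested is a separate decision.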
Provider API Technical info
There isn't a clear way for a frontend user to filter books on the site by license type.
The basic API documentation doesn't include license info at the high level:
https://unglue.it/api/v1/?format=json
However, they reference an ONIX structure, where rights information is returned in the Epub License field:
CC BY-NC-ND
01
https://creativecommons.org/licenses/by-nc-nd/3.0/
For example:
https://unglue.it/api/onix/by-nc-nd/epub/?max=20
More work is needed to determine whether we can get all the information we need for ingestion.
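To make the ONIX route more concrete, here is a minimal sketch of pulling the license fields out of an ONIX response. The element names (`EpubLicense`, `EpubLicenseName`, `EpubLicenseExpressionType`, `EpubLicenseExpressionLink`, `RecordReference`) are assumed from the ONIX 3.0 reference tags and have not been verified against the actual Unglue.it feed; the sample document is hypothetical and only reuses the three values shown above.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical sample; the real feed may differ in structure and namespaces.
SAMPLE_ONIX = """<ONIXMessage release="3.0">
  <Product>
    <RecordReference>example-record-1</RecordReference>
    <DescriptiveDetail>
      <EpubLicense>
        <EpubLicenseName>CC BY-NC-ND</EpubLicenseName>
        <EpubLicenseExpression>
          <EpubLicenseExpressionType>01</EpubLicenseExpressionType>
          <EpubLicenseExpressionLink>https://creativecommons.org/licenses/by-nc-nd/3.0/</EpubLicenseExpressionLink>
        </EpubLicenseExpression>
      </EpubLicense>
    </DescriptiveDetail>
  </Product>
</ONIXMessage>
"""

# Assumed ONIX element name -> key used in the returned dictionaries.
FIELDS = {
    "RecordReference": "record_reference",
    "EpubLicenseName": "license_name",
    "EpubLicenseExpressionType": "expression_type",
    "EpubLicenseExpressionLink": "license_url",
}


def _local(tag: str) -> str:
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def extract_licenses(onix_xml: str) -> List[Dict[str, str]]:
    """Return one dict of license fields per Product element."""
    root = ET.fromstring(onix_xml)
    records = []
    for product in root.iter():
        if _local(product.tag) != "Product":
            continue
        record = {}
        for element in product.iter():
            name = _local(element.tag)
            if name in FIELDS and element.text:
                record[FIELDS[name]] = element.text.strip()
        records.append(record)
    return records
```

Matching on local tag names keeps the sketch working whether or not the feed declares an XML namespace. In a real script the XML would come from a request to an endpoint such as the by-nc-nd example above rather than from an embedded string.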
General Recommendations for implementation
The script should live in the src/cc_catalog_airflow/dags/provider_api_scripts/ directory.
The script should use the ImageStore class (import this from src/cc_catalog_airflow/dags/provider_api_scripts/common/storage/image.py).
The script should use the DelayedRequester class (import this from src/cc_catalog_airflow/dags/provider_api_scripts/common/requester.py).
The script should not use anything from src/cc_catalog_airflow/dags/provider_api_scripts/modules/etlMods.py, since that module is deprecated.
The script should take a --date parameter when run as a script, giving the date for which we should collect images. The form should be YYYY-MM-DD (so, the script can be run via python my_favorite_provider.py --date 2018-01-01).
The script should provide a main function that takes the same parameters as the CLI. In our example from above, we'd then have a main function my_favorite_provider.main(date). The main function should do the same thing that calling from the CLI would do.
Use pycodestyle (available via pip install pycodestyle) to check for compliance. Long lines are acceptable where appropriate (e.g., long strings for testing).
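The --date and main(date) recommendations above could be sketched roughly as follows. This is only an illustration of the shape of the entry points: `parse_date` and `parse_args` are hypothetical names, and the body of `main` is a placeholder rather than real provider logic.

```python
import argparse
from datetime import datetime


def parse_date(text: str) -> str:
    """Check that text is a YYYY-MM-DD date and return it unchanged."""
    # strptime raises ValueError on malformed input; argparse turns that
    # into a normal usage error when this function is used as a `type`.
    datetime.strptime(text, "%Y-%m-%d")
    return text


def main(date: str) -> str:
    """Entry point shared by the CLI and by code importing the module.

    Placeholder body: a real script would query the provider for `date`
    and store the results.
    """
    parse_date(date)
    return f"Would collect records for {date}"


def parse_args(argv=None) -> argparse.Namespace:
    """Parse CLI arguments; argv=None means 'use sys.argv'."""
    parser = argparse.ArgumentParser(
        description="Hypothetical provider API script",
    )
    parser.add_argument(
        "--date",
        required=True,
        type=parse_date,
        help="Date to collect records for, in the form YYYY-MM-DD",
    )
    return parser.parse_args(argv)
```

When run as a script, the module would call main(parse_args().date) under an `if __name__ == "__main__":` guard, so that the CLI and a direct call to main(date) do the same thing.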
Examples of other Provider API Scripts
For example Provider API Scripts and accompanying test suites, please see src/cc_catalog_airflow/dags/provider_api_scripts/flickr.py and src/cc_catalog_airflow/dags/provider_api_scripts/test_flickr.py, or src/cc_catalog_airflow/dags/provider_api_scripts/wikimedia_commons.py and src/cc_catalog_airflow/dags/provider_api_scripts/test_wikimedia_commons.py.