Commit
[ENG-4380][ENG-4382] iri-based ingest and search (#806)
* wip: index-card-search api
* wip
* wip
* wip
* wip
* wip
* wip
* wip
* wip
* wip
* less wip
* less wip...
* api/v3
* wip
* wip
* use ELASTIC_PASSWORD consistently
* fix: create "search" alias on initial setup
* wip
* wip
* wip
* wip
* wip-migrations
* wipwip
* wip (with piri tests)
* wip (partial digestion)
* wip (vocab, browse, sharev2)
* wip (derive)
* wipwip
* wip
* wip
* wip
* wip (with Indexcard)
* wip (with trove_iris index)
* wip
* wip (no openended, no fuzzy highlight)
* wip (trove_iris cardsearch)
* wip
* wip
* wip
* drop models: IngestJob, RegulatorLog
* wip
* wip (broken)
* wip (less broken)
* wip (identifier-based index)
* wip (populate identifier-based index)
* wip (deletable indexcards)
* fix: no break on no query
* wip (fix path indexing)
* wipwip
* wip (multi-phase backfill)
* wipwip
* wip (working value-search)
* wip (stabler sharev2_elastic backcompat)
* dockerfile: pin to bullseye
* restore unsafe settings, for the moment
* wip (valuesearch results as indexcards)
* text not iri
* osfmappy property paths
* legacy sharev2 blank node hackery
* slightly better sharev2-blanknode hackery
* pay a toll for lack of tests
* enable bingest
* reduce deadlock likelihood
* avoid mismatched iri
* smaller valuesearch
* legacy-sharev2 subjects
* fix broken harvest
* better search-admin experience
* shorter paths; capped keywords
* cardsearch sort
* fix schedule_harvests error
* is ok to search a still-filling index
* add periodic tasks back to admin
* index both full-iri and suff-uniq paths
* fix schedule_harvests (again)
* fix: handle multi-word queries
* add 'api/v3/' synonym to 'trove/'
* more focused default text search
* cleaner legacy_sharev2 extract
* osfmap updates
* loosen query-param name restrictions
* tidy up
* fix local card_iri
* harvester.key -> harvester_key
* remove SHARE_API_URL setting
* add suid info to indexed indexcard
* osfmap metadata update (subjects, collection)
* accept "sort=-relevance" same as no sort
* count skos labels as labels
* avoid duplicate indexcards
* add backfill rate to admin/search-indexes
* static-ish relatedPropertysearch
* suggest property iris in consistent order
* fix valueSearchText query
* remove unhelpful expense
* allow prefixed shortnames for iri params
* tidy
* make relatedPropertySearch to-one
* pagination links
* fix relatedPropertySearch (add intermediate search-result)
* relatedProperties (not relatedPropertySearch)
* filterValueSet with value info
* better handling unknown filter value
* daemon debug logs
* tidy digestive messaging
* keep the ontology special-case special
* plac8 flake8
* nicer rdf in admin interface
* fix: deletion via legacy is_deleted
* fix some tests
* fix legacy_sharev2 extract
* fix legacy_sharev2 extract for real this time
* fix: digestive_tract.extract return value
* better handle "sort by relevance to nothing"
* stop causing index refresh during backfill
* fix failing tests (and pull some threads)
* fix some tests (wip)
* add counts for related properties
* daemon never lived
* remove debuggin change
* fix sharev2_elastic5 tests(?)
* wip/tmp
* support valuesearch on date property
* cardSearchFilter[date]=2023
* reverse order for value-search on dates
* stop mis-mapping dateWithdrawn
* value-search on date: exclude empty years
* fix: deleting records for a suid includes cards
* add display label to osfmap properties
* add filter operators: is-present, is-absent e.g. `cardSearchFilter[funder][is-present]`
* stop putting osfmap types on legacy data
* always show queue size/rate in admin/search-indexes
* remove dead/unused code and concepts
* related-property-path
* set label label to "displayLabel"
* fix: make sure deleted indexcards are de-indexed
* better de-index of data for disabled sourceconfig
* de-index items without title/name/label
* fix: sharev2_elastic withdrawn should be bool
* fix: fill sharev2_elastic `type` for rdf-ingested cards
* skip storing all child tasks -- can be a long list
* tidier message logging
* sharev2_elastic: |-delimited subject lineage
* fix: include redundant references in jsonld
* lil nicities
* remove hasVersion from suggested properties
* prevent duplicates between v2_push and rdf tracts
* indexcard-based oaipmh
* fix: allow empty operator (`cardSearchFilter[foo][]`)
* fix: separate old (sharev2-push) and new (trove-rdf)
* admin: display raw datum in <pre>
* fix: accept fileName as sufficient for indexing
* fix: actually update datestamp when a RawDatum is seen again
* add "affiliation" to suggested preprint properties
* preprint affiliations are implicit via creator
* fix tests
* skip indexing invalid dates
* fix sentry `capture_message` usage
* fix more tests
* add search_params tests
* support "nonurgent" query param on ingest
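The commit message mentions several query-parameter shapes: `cardSearchFilter[date]=2023`, `cardSearchFilter[funder][is-present]`, the empty operator in `cardSearchFilter[foo][]`, and `sort=-relevance`. As a rough illustration of how names of that shape could be split into a property and an operator, here is a minimal sketch. It is not the parser added in this commit; the function names `parse_param_name` and `parse_filters`, the regular expressions, and the choice to represent a missing operator as an empty string are all assumptions made for this example.

```python
import re
from urllib.parse import parse_qsl

# Hypothetical patterns (not taken from the commit): a family name
# followed by zero or more [bracketed] parts, each possibly empty.
_PARAM_NAME = re.compile(r'^(?P<family>[^\[\]]+)(?P<brackets>(?:\[[^\[\]]*\])*)$')
_BRACKETED = re.compile(r'\[([^\[\]]*)\]')


def parse_param_name(name):
    """Split a name like 'cardSearchFilter[funder][is-present]'
    into its family and the list of bracketed parts."""
    match = _PARAM_NAME.match(name)
    if match is None:
        raise ValueError(f'unrecognized query param name: {name!r}')
    return match['family'], _BRACKETED.findall(match['brackets'])


def parse_filters(querystring):
    """Collect (property, operator, value) triples for every
    cardSearchFilter param; other params (e.g. sort) are skipped."""
    filters = []
    # keep_blank_values=True keeps params that have no "=value" part
    for name, value in parse_qsl(querystring, keep_blank_values=True):
        family, parts = parse_param_name(name)
        if family != 'cardSearchFilter' or not parts:
            continue
        prop = parts[0]
        # assumption: a missing or empty operator is recorded as ''
        operator = parts[1] if len(parts) > 1 else ''
        filters.append((prop, operator, value))
    return filters


if __name__ == '__main__':
    example = (
        'cardSearchFilter[date]=2023'
        '&cardSearchFilter[funder][is-present]'
        '&cardSearchFilter[foo][]=bar'
        '&sort=-relevance'
    )
    for triple in parse_filters(example):
        print(triple)
```

What an empty operator should default to is left open here, since the commit message only says that it is allowed.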
Showing 161 changed files with 13,792 additions and 2,528 deletions.
@@ -1,4 +1,4 @@
-FROM python:3.10-slim as app
+FROM python:3.10-slim-bullseye as app

 RUN apt-get update \
     && apt-get install -y \
Three files were deleted in this commit (diffs not shown).