This repository was archived by the owner on Jan 29, 2024. It is now read-only.
Add Elasticsearch articles table #618
Closed
Labels
🗄️ database: Creation and maintenance of a database of scientific literature
Description
Context
- In the past, we had a SQL database with two tables, `sentences` and `articles`.
  - `sentences` contained one article sentence per row, plus a field (`FOREIGN KEY`) identifying which article in `articles` the sentence came from. See Search/src/bluesearch/entrypoint/database/schemas.py, lines 55 to 56 in f77d8c1 (`Table( "sentences",`).
  - `articles` contained the metadata of one article per row (e.g. year of publication, journal, ...). See Search/src/bluesearch/entrypoint/database/schemas.py, lines 36 to 37 in f77d8c1 (`Table( "articles",`).
- After Semantic search + Exact matches in ElasticSearch #610, we decided to use Elasticsearch instead of a SQL db to store our texts (i.e. the `sentences` table).
- Moreover, Question-Answering systems need to work with contexts longer than a single sentence, so rather than `sentences` we now want a `paragraph` table (but still with a `FOREIGN KEY` linking each row to the article it comes from).
- After moving the `paragraph` table to Elasticsearch, it becomes natural to also have the `article` table on Elasticsearch, rather than in a separate SQL database.
Actions
- Create an `articles` table on Elasticsearch, with the same schema (here) as the SQL table we had until now.
- Perform some testing to make sure that it is possible to perform joins in Elasticsearch, i.e. retrieve the article rows from the `articles` table given the `article_id`s of some rows in `paragraphs`, and vice versa retrieve all paragraphs given some `article_id`s.
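Elasticsearch has no SQL-style join across indices, so one way to test the second action is an application-side join: run one search, collect the `article_id` values, and feed them into a `terms` query on the other index. The sketch below only builds the request bodies as plain dictionaries. The mapping fields `journal` and `publish_date`, the function names, and the choice of storing `article_id` as a `keyword` field are assumptions for illustration, not the actual schema from `schemas.py`.

```python
from typing import Any, Dict, Iterable

# Hypothetical mapping for the `articles` index. Only `article_id` is taken
# from the issue text; the other field names are illustrative assumptions.
ARTICLES_MAPPING = {
    "properties": {
        "article_id": {"type": "keyword"},
        "journal": {"type": "keyword"},
        "publish_date": {"type": "date"},
    }
}


def terms_query(field: str, values: Iterable[str], size: int = 100) -> Dict[str, Any]:
    """Build a search body matching documents whose `field` equals any value."""
    unique_values = sorted(set(values))  # deduplicate and keep the order stable
    return {"size": size, "query": {"terms": {field: unique_values}}}


def articles_for_paragraphs(paragraph_hits: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """paragraphs -> articles: collect ids from paragraph hits, then query `articles`."""
    ids = [hit["_source"]["article_id"] for hit in paragraph_hits]
    return terms_query("article_id", ids)


def paragraphs_for_articles(article_ids: Iterable[str], size: int = 100) -> Dict[str, Any]:
    """articles -> paragraphs: query `paragraphs` by the foreign-key field."""
    return terms_query("article_id", article_ids, size=size)


if __name__ == "__main__":
    # Paragraph hits shaped like the entries of an Elasticsearch search response.
    hits = [
        {"_source": {"article_id": "a2", "text": "first paragraph"}},
        {"_source": {"article_id": "a1", "text": "second paragraph"}},
        {"_source": {"article_id": "a2", "text": "third paragraph"}},
    ]
    print(articles_for_paragraphs(hits))
    print(paragraphs_for_articles(["a1"]))
```

The resulting bodies would be sent to the `articles` and `paragraphs` indices respectively. Note that a search returns only 10 hits by default, so `size` (or pagination) has to cover the expected number of paragraphs per article, and a `terms` query accepts a bounded number of values per request.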