- The dataset is distributed as a MongoDB dump, so it is easy to import.
- Thanks to the BSON data storage format, there is no need for an ORM, which means fewer components and lower development costs.
- Document schemas are flexible, so there is no need to design a rigid database structure up front, while data validation is still available.

Indexes:

- `papers._id` default index;
- `papers.title` to search papers by title;
- `papers.abstract` to search papers by abstract;
- `papers.keywords` to search papers by tags;
- `papers.authors.name` to search papers by authors;
- `papers.venue.raw` to search papers by venue names.
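
The index list above can be sketched in Python as follows. This is only an illustration: the field paths come from the list, but the plain single-field (ascending) index type and the helper name `ensure_indexes` are assumptions, and the project may well use text indexes for search instead. The `collection` argument is expected to behave like a pymongo `Collection`, whose `create_index` method accepts a field name and returns the index name.

```python
from typing import List, Protocol

# Field paths taken from the index list above. The ascending single-field
# index type is an assumption of this sketch, not the project's setup.
INDEXED_FIELDS: List[str] = [
    "title",
    "abstract",
    "keywords",
    "authors.name",
    "venue.raw",
]


class SupportsCreateIndex(Protocol):
    """Anything with a pymongo-like create_index(keys) method."""

    def create_index(self, keys: str) -> str: ...


def ensure_indexes(collection: SupportsCreateIndex) -> List[str]:
    """Create one single-field index per entry and return the index names."""
    # `_id` is skipped because MongoDB indexes it automatically.
    return [collection.create_index(field) for field in INDEXED_FIELDS]
```

With pymongo installed, the function could be called on the `papers` collection of whichever database the dump was restored into; the database name is not stated here, so it is left to the reader.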
We found the following fields and data models useful.

Author:

| Field name | Type | Meaning |
|---|---|---|
| name | str | author's name |
| org | str | org's name |

Venue:

| Field name | Type | Meaning |
|---|---|---|
| raw | str | venue's name |
| publisher | Optional[str] | publisher's name |

Paper:

| Field name | Type | Meaning |
|---|---|---|
| _id | ObjectId | paper id |
| title | str | paper name |
| authors | Optional[List[Author]] | list of authors |
| venue | Optional[Venue] | paper venue |
| year | int | year of writing |
| keywords | Optional[List[str]] | list of keywords for the paper |
| fos | Optional[List[str]] | paper fields of study |
| n_citation | Optional[int] | citation count |
| lang | Optional[str] | paper language |
| doi | Optional[str] | paper doi |
| abstract | Optional[str] | paper abstract |

Paper tag:

| Field name | Type | Meaning |
|---|---|---|
| _id | ObjectId | paper id |
| tag | int | paper tag |
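
The tables above map naturally onto typed Python classes. The sketch below uses standard-library dataclasses purely for illustration. The class names `Paper` and `PaperTag`, the helper `paper_from_document`, and the use of `str` in place of `ObjectId` are assumptions, and the `abstract` key follows the `papers.abstract` index path. The project's actual models may differ.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class Author:
    name: str
    org: str


@dataclass
class Venue:
    raw: str
    publisher: Optional[str] = None


@dataclass
class Paper:
    id: str  # stands in for the ObjectId stored under `_id` (assumption)
    title: str
    year: int
    authors: Optional[List[Author]] = None
    venue: Optional[Venue] = None
    keywords: Optional[List[str]] = None
    fos: Optional[List[str]] = None
    n_citation: Optional[int] = None
    lang: Optional[str] = None
    doi: Optional[str] = None
    abstract: Optional[str] = None


@dataclass
class PaperTag:
    id: str  # same paper id as in Paper
    tag: int


def paper_from_document(doc: Dict[str, Any]) -> Paper:
    """Build a Paper from a raw MongoDB document (a plain dict)."""
    authors = doc.get("authors")
    venue = doc.get("venue")
    return Paper(
        id=str(doc["_id"]),
        title=doc["title"],
        year=doc["year"],
        authors=None if authors is None
        else [Author(a["name"], a["org"]) for a in authors],
        venue=None if venue is None
        else Venue(venue["raw"], venue.get("publisher")),
        keywords=doc.get("keywords"),
        fos=doc.get("fos"),
        n_citation=doc.get("n_citation"),
        lang=doc.get("lang"),
        doi=doc.get("doi"),
        abstract=doc.get("abstract"),
    )
```

Optional fields default to `None`, so a document that only carries `_id`, `title`, and `year` still converts cleanly.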
1. `git clone https://github.com/Volodimirich/MadeFinalProject.git`
2. `MADE_PATH=${PWD}/MadeFinalProject ./MadeFinalProject/scripts/download-main-data.sh` to download and process the dataset (takes about 11 minutes)
3. `cd MadeFinalProject`
4. Possible runs:
   - `docker-compose up app db test-seed tag-seed rec-data-seed` runs the app and imports the test dataset;
   - `docker-compose up app db main-seed tag-seed rec-data-seed` runs the app and imports the main dataset (requires running the script from step 2 first);
   - add `mongo-express` to the command to run the MongoDB web interface;
   - add `grafana` to the command to run monitoring.

- `http://127.0.0.1:8000` Homepage;
- `http://127.0.0.1:8000/docs` Swagger UI with the API methods.

Prometheus was chosen as the monitoring system.
Prometheus is available at http://127.0.0.1:9090.
To verify access to the database, open
http://127.0.0.1:9090/targets?search= and, once the page loads,
check the availability of the http://mongodb-exporter:9216/metrics target.
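
The same check can be scripted against the Prometheus HTTP API, which lists scrape targets at `/api/v1/targets`. The sketch below is a minimal illustration. The function names are hypothetical, and it assumes Prometheus is reachable at the address given above.

```python
import json
from typing import Any, Dict, List
from urllib.request import urlopen

PROMETHEUS_URL = "http://127.0.0.1:9090"  # address from the text above


def unhealthy_targets(payload: Dict[str, Any]) -> List[str]:
    """Return scrape URLs of active targets whose health is not "up"."""
    targets = payload.get("data", {}).get("activeTargets", [])
    return [t.get("scrapeUrl", "") for t in targets if t.get("health") != "up"]


def fetch_targets(base_url: str = PROMETHEUS_URL) -> Dict[str, Any]:
    """Download the target list from a running Prometheus instance."""
    with urlopen(f"{base_url}/api/v1/targets", timeout=5) as response:
        return json.load(response)
```

Calling `unhealthy_targets(fetch_targets())` while the stack is running returns an empty list when every target, including the MongoDB exporter, is up.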

Grafana was chosen as the dashboard. To use it, follow these steps:

- Go to `http://127.0.0.1:3000` (login: admin, password: pass@123).
- In the window that opens, go to Datasources -> Prometheus. In the new window, set the URL to `http://prometheus:9090`. Then press the Save & test button and, once the success message appears, press the Back button.
- In the left panel, choose Dashboard, then select the +Import button in the pop-up window. In the new window, enter and load the code 2583. Then select prometheus and press the other Load button.
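
For setups where clicking through the UI is inconvenient, a data source can also be registered through Grafana's HTTP API (`POST /api/datasources`). The sketch below only builds the request and does not send it. The helper name and the data source name `Prometheus` are assumptions, and because the steps above suggest a Prometheus data source already exists, Grafana would likely reject a second one with the same name.

```python
import base64
import json
from urllib.request import Request

GRAFANA_URL = "http://127.0.0.1:3000"  # address from the steps above
PROMETHEUS_INTERNAL_URL = "http://prometheus:9090"  # URL set in the UI steps


def build_datasource_request(
    user: str = "admin",
    password: str = "pass@123",
    name: str = "Prometheus",  # hypothetical data source name
) -> Request:
    """Build (but do not send) a request that adds a Prometheus data source."""
    body = json.dumps(
        {
            "name": name,
            "type": "prometheus",
            "url": PROMETHEUS_INTERNAL_URL,
            "access": "proxy",
        }
    ).encode("utf-8")
    # Grafana accepts HTTP basic auth for API calls.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return Request(
        f"{GRAFANA_URL}/api/datasources",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would perform the call. Importing the dashboard with code 2583 is still done through the UI as described.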