- Add container and script for creating the SQL file from the Wikipedia dump and uploading it to an S3 bucket
- Add container and script for downloading the SQL file and recreating the dump in a PostgreSQL database
- Add versioning based on the SHA256 checksum of the SQL file to force-recreate the database when a newer version exists
- Configuration files for rclone (rclone.conf) containing secrets are missing from this commit
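The checksum-based versioning described above could look roughly like the following. This is a minimal sketch, not the code from this PR: the function names (`sha256_of_file`, `needs_recreate`, `record_version`) and the idea of storing the last imported checksum in a plain text file are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_recreate(sql_file: Path, version_file: Path) -> bool:
    """Hypothetical check: recreate when no checksum is recorded or it differs."""
    if not version_file.exists():
        return True
    recorded = version_file.read_text().strip()
    return recorded != sha256_of_file(sql_file)


def record_version(sql_file: Path, version_file: Path) -> None:
    """Hypothetical helper: store the checksum of the SQL file that was imported."""
    version_file.write_text(sha256_of_file(sql_file) + "\n")
```

In this sketch the import script would call `needs_recreate` before touching the database and `record_version` only after a successful import, so a failed import is retried on the next run.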
c3affae to 925d090
3f4f93b to 8575665
K8s config:
This PR is meant to incorporate Human-in-the-Loop in the KG generation process. It adds the following main features:
Next steps:
A new volume has been added, |
9762036 to 78447b4
@ralf-berger when would be a good time to merge this branch into main, considering the deployment changes that need to be made?
Should be ready, albeit completely untested. Feel free to go ahead with the merge.
What is this PR trying to achieve?
This PR incorporates Human-in-the-Loop into the KG generation process. It also greatly increases processing speed and reduces system requirements for generating and expanding knowledge graphs by using a database containing structured data of every Wikipedia article.
How does this PR implement it?
It adds the following main features:
- Splits the coursemapper-kg-worker-concept-map service into 3 services: coursemapper-kg-worker-concept-map, coursemapper-kg-worker-modify-graph, and coursemapper-kg-worker-expand-material.