
Use encodings from db instead of minio #522

Merged on Mar 16, 2020 (14 commits)

@hardbyte hardbyte commented Mar 3, 2020

Builds on top of the PR that fixes Jaeger: #523

This PR completes the switch from storing encodings in minio to postgres. It makes several short-term assumptions, such as that there will be one default block per data provider and that encoding ids will be sequential, non-repeating integers. These assumptions will be addressed over the next several sprints.

Chunking information previously included object store filenames; now each chunk comprises a data provider ID and a range of encoding ids, e.g. dp_id=45, range=[2000,5000]. The comparison task now queries the database to fetch these encodings (currently ignoring blocks).
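To illustrate the new chunk shape, here is a minimal sketch of fetching a chunk's encodings by data provider ID and id range. It is not the code in this PR: the `Chunk` type, the `encodings` table and its columns, and the helper names are all hypothetical, the range bounds are assumed to be inclusive, and an in-memory SQLite database stands in for postgres so the example is self-contained.

```python
import sqlite3
from typing import NamedTuple


class Chunk(NamedTuple):
    """One side of a comparison chunk (hypothetical type, not from the PR)."""
    dp_id: int
    start: int  # first encoding id in the chunk (assumed inclusive)
    stop: int   # last encoding id in the chunk (assumed inclusive)


def fetch_encodings(conn: sqlite3.Connection, chunk: Chunk) -> list:
    """Return (encoding_id, encoding) rows for the chunk, ordered by id."""
    cursor = conn.execute(
        "SELECT encoding_id, encoding FROM encodings "
        "WHERE dp_id = ? AND encoding_id BETWEEN ? AND ? "
        "ORDER BY encoding_id",
        (chunk.dp_id, chunk.start, chunk.stop),
    )
    return cursor.fetchall()


def make_demo_db() -> sqlite3.Connection:
    """Build a small in-memory table standing in for the real database."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per encoding, keyed by provider and id.
    conn.execute(
        "CREATE TABLE encodings ("
        "dp_id INTEGER, encoding_id INTEGER, encoding BLOB, "
        "PRIMARY KEY (dp_id, encoding_id))"
    )
    rows = [(45, i, bytes([i % 256]) * 4) for i in range(1990, 2011)]
    rows += [(46, i, b"\x00" * 4) for i in range(2000, 2005)]
    conn.executemany("INSERT INTO encodings VALUES (?, ?, ?)", rows)
    return conn
```

With this sketch, `fetch_encodings(make_demo_db(), Chunk(45, 2000, 2005))` returns only the rows for data provider 45 whose ids fall in the requested range. The range lookup relies on the stated assumption that encoding ids are sequential, non-repeating integers.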

Tracing child spans was not working inside Celery tasks, so I've fixed it in the tasks I touched.

@hardbyte hardbyte requested a review from wilko77 March 3, 2020 03:32
@hardbyte hardbyte force-pushed the feature-use-encodings-from-db branch from b7781f8 to 631aa67 Compare March 3, 2020 22:29
# retrieval of encoding ids should be much faster than insertion
assert fetch_time < elapsed_time

assert id_fetch_time < elapsed_time

These timing assertions might not always hold (e.g. if there is a load spike in the db). Should we really fail if it goes wrong once?
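One possible way to make such a check less sensitive to a single slow run is to compare medians over several repetitions rather than single samples. The sketch below is only an illustration of that idea, not what the PR does: `median_duration` and its `clock` parameter are hypothetical names, and an operation like insertion would need a fresh fixture for each repetition.

```python
import statistics
import time
from typing import Callable


def median_duration(
    operation: Callable[[], object],
    repeats: int = 5,
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Run `operation` `repeats` times and return the median duration.

    `clock` is injectable so the helper itself can be tested deterministically.
    """
    durations = []
    for _ in range(repeats):
        start = clock()
        operation()
        durations.append(clock() - start)
    # The median ignores a single outlier caused by, say, a load spike.
    return statistics.median(durations)
```

A test could then assert that `median_duration(fetch)` is less than `median_duration(insert)`, possibly with a slack factor, so one unlucky measurement does not fail the build.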


parent_span = g.flask_tracer.get_span()

def precheck_encoding_upload(project_id, headers, parent_span):

A docstring would be nice.

@hardbyte hardbyte merged commit 1a37280 into develop Mar 16, 2020
@hardbyte hardbyte deleted the feature-use-encodings-from-db branch March 22, 2021 01:48