
Conversation

HazAT
Member

@HazAT HazAT commented Feb 1, 2018

This adds a POST /api/0/projects/sentry/internal/files/difs/assemble/ API endpoint to Sentry.

sentry-cli

tl;dr
This is part 2, following #7095.
This enables us to upload DIFs (Debug Information Files) of arbitrary size.
95%+ of the changed lines are the added crash.sym fixture for tests.

This endpoint pieces together chunks that were uploaded before in the mentioned PR.
It adds a new model called FileBlobOwner, which ensures that whoever uploads a blob has access rights to it (scoped to the Organization).
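
A sketch of what such a model might look like (the field names and Meta options are assumptions based on the description, not the merged code):

from sentry.db.models import FlexibleForeignKey, Model


class FileBlobOwner(Model):
    # ties an uploaded blob to the organization that is allowed to use it
    blob = FlexibleForeignKey('sentry.FileBlob')
    organization = FlexibleForeignKey('sentry.Organization')

    class Meta:
        app_label = 'sentry'
        unique_together = (('blob', 'organization'), )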

The request body is validated against a JSON schema.

Request
POST /api/0/projects/sentry/internal/files/difs/assemble/
Body:

{
    "38fbd8b2cbe56884115e324dd5f2a10c8201450c": {
        "type": "dif",
        "name": "test.dsym",
        "params": {
            "project": "java"
        },
        "chunks": [
            "38fbd8b2cbe56884115e324dd5f2a10c8201450c"
        ]
    }
}

This actually tries to assemble a file if all chunks were uploaded before.
If chunks are missing (or ownership is missing), the response will be:

{
    "38fbd8b2cbe56884115e324dd5f2a10c8201450c": {
        "state": "not_found",
        "missingChunks": ["38fbd8b2cbe56884115e324dd5f2a10c8201450c"]
    }
}

If all chunks are already uploaded and the file did not exist before, the response will be:

{
    "38fbd8b2cbe56884115e324dd5f2a10c8201450c": {
        "state": "created",
        "missingChunks": []
    }
}

State can be:

ChunkFileState = enum(
    OK='ok', # File in database
    NOT_FOUND='not_found', # File not found in database
    CREATED='created', # File was created in the request and sent to the worker for assembling
    ASSEMBLING='assembling', # File still being processed by worker
    ERROR='error' # Error happened during assembling
)

This will trigger the task sentry.tasks.assemble.assemble_chunks to do the actual assembling.

The assemble task supports DIFs (Debug Information Files) such as dSYMs.
It does not currently support ProGuard or source map files.
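
For reference, the end-to-end client flow might look like this (a minimal sketch using requests; the host and token are placeholders, and the polling loop is an assumption based on the states above):

import time

import requests

URL = ('https://sentry.example.com'  # placeholder host
       '/api/0/projects/sentry/internal/files/difs/assemble/')
HEADERS = {'Authorization': 'Bearer <token>'}  # placeholder token

body = {
    '38fbd8b2cbe56884115e324dd5f2a10c8201450c': {
        'type': 'dif',
        'name': 'test.dsym',
        'params': {'project': 'java'},
        'chunks': ['38fbd8b2cbe56884115e324dd5f2a10c8201450c'],
    }
}

# keep asking until the worker reports a terminal state
while True:
    result = requests.post(URL, json=body, headers=HEADERS).json()
    state = result['38fbd8b2cbe56884115e324dd5f2a10c8201450c']['state']
    if state not in ('created', 'assembling'):
        break  # 'ok', 'not_found' (missing chunks), or 'error'
    time.sleep(1)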

Legacy

Request
POST /api/0/chunk-assemble/
Body:

{
    "38fbd8b2cbe56884115e324dd5f2a10c8201450c": true
}

This checks the File model for whether a file with this checksum already exists in the database.

Response:

{
    "38fbd8b2cbe56884115e324dd5f2a10c8201450c": {
        "state": "ok",
        "missingChunks": []
    }
}

@HazAT HazAT self-assigned this Feb 1, 2018
@ghost

ghost commented Feb 1, 2018

2 Warnings
⚠️ You should update CHANGES due to the size of this PR
⚠️ PR includes migrations

Migration Checklist

  • new columns need to be nullable (unless table is new)
  • migration with any new index needs to be done concurrently
  • data migrations should not be done inside a transaction
  • before merging, check to make sure there aren't conflicting migration ids

Generated by 🚫 danger

@HazAT HazAT force-pushed the feature/dsym-assemble branch from 93549ad to 171b6c5 on February 5, 2018 at 15:30

return Response(
    {
        'url': '{}{}'.format(endpoint, reverse('sentry-api-0-chunk-upload')),
Member Author

This is the only thing that changed here.
We want to return the full URL instead of just the "domain".

CELERY_QUEUES = [
    Queue('alerts', routing_key='alerts'),
    Queue('auth', routing_key='auth'),
    Queue('assemble', routing_key='assemble'),
Contributor

cc @JTCunning I'm not sure if we need to do anything special anymore from ops to handle a new queue.

Contributor

No, we're good. If the task takes up a significant amount of resources, we'll isolate it with another pool of workers.

Contributor

@mattrobenolt mattrobenolt left a comment

Just blocking this until we add verification to checksums like we discussed offline.

The type is a File.ChunkAssembleType
'''
if len(file_blob_ids) == 0:
    logger.warning('sentry.tasks.assemble.assemble_chunks', extra={
Contributor

You can remove all of your log statements that are prepended with sentry.tasks.assemble since that will be in the logger name and is unnecessary.

file.assemble_from_file_blob_ids(file_blob_ids, checksum)
if file.headers.get('state', '') == ChunkFileState.ERROR:
    logger.error(
        'sentry.tasks.assemble.assemble_chunks',
Contributor

Since you have multiple assemble_chunks error statements throughout your code, you should append them with the logical reason they're erroring, so assemble_chunks.state_error or something.
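
E.g.:

logger.error('assemble_chunks.state_error', extra={'checksum': checksum})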

@mitsuhiko
Contributor

I think we should kill mode 1 and always require the chunks and metadata to be sent (as a file with the same checksum might exist with other parameters). Additionally, we will need to org-scope the chunk upload for security reasons, as discussed on Slack.

I'm fine storing a chunk-verified bit in cache for 12 hours which should also be our "chunk not stable" time. David also says we can keep a huge table. Either works I think.

@mitsuhiko mitsuhiko closed this Feb 5, 2018
@mitsuhiko mitsuhiko reopened this Feb 5, 2018
@mitsuhiko
Contributor

What we discussed on slack:

  • store org chunk ownership in a table (TODO: consider if we want to do this for internal writes too or only external chunk upload)
  • later introduce shared dsym files again (separate model, shared_dsymfile) to avoid users having to upload common files multiple times. Also consider symbolserver here again
  • move the endpoint to org level

Notes unrelated to above convo:

  • remove sha1 only file assemble, require meta info.
  • make assemble fully dsym specific (eg: move it into a project url)

@jan-auer
Member

jan-auer commented Feb 5, 2018

Would be great if we could return the error description in /chunk-assemble/, if any. Right now, there's no convenient way to retrieve it via the new endpoint.

@HazAT
Member Author

HazAT commented Feb 6, 2018

I've added a new model called FileBlobOwner which checks if someone from the same org already uploaded the chunk.
It wasn't that trivial after all to handle all possible scenarios, but the tests should cover it, and we tested it in conjunction with what @jan-auer already did with sentry-cli 🎉
(see gif in the first comment)

@HazAT
Member Author

HazAT commented Feb 6, 2018

@mattrobenolt any new feedback?

Contributor

@mitsuhiko mitsuhiko left a comment

Left some specific comments.

Generally though I think we should really make this a DIF-specific endpoint for now (e.g. on a DIF-specific URL instead of the generic chunk-assemble). Reason being that it seems cleaner from an access-management point of view and fits better into how the current code functions.

    )
except IntegrityError:
    pass
if blob.checksum not in checksum_list:
Contributor

Can we change this loop to be a for checksum, chunk in izip(checksums, files) and then check the blob.checksum against the checksum directly?
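
The suggested shape, as a sketch (store_chunk is a hypothetical stand-in for the existing blob-creation logic):

from itertools import izip  # Python 2, which Sentry targeted at the time

for checksum, chunk in izip(checksums, files):
    blob = store_chunk(chunk)  # hypothetical: persists the chunk, returns a FileBlob
    if blob.checksum != checksum:
        # compare against the expected checksum directly instead of
        # doing a membership test on checksum_list
        raise ValueError('checksum mismatch for uploaded chunk')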

for owned_blob in all_owned_blobs:
    owned_blobs.append((owned_blob.blob.id, owned_blob.blob.checksum))

# If the request does not cotain any chunks for a file
Contributor

typo "cotain"

elif len(owned_blobs) != len(chunks):
    # Create a missing chunks array which we return as response
    # so the client knows which chunks to reupload
    missing_chunks = list(chunks)
Contributor

make this into missing_chunks = set(chunks) and then remove items with missing_chunks.discard(blob[1]). Faster and easier (O(1) vs O(n something))
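
I.e., with stand-in data:

chunks = ['aaa', 'bbb', 'ccc']          # checksums the client claims to have uploaded
owned_blobs = [(1, 'aaa'), (3, 'ccc')]  # (blob id, checksum) pairs found for the org

missing_chunks = set(chunks)
for blob in owned_blobs:
    missing_chunks.discard(blob[1])  # O(1) removal vs. O(n) for list.remove
# missing_chunks == {'bbb'} -> the client needs to reupload this chunk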


def forwards(self, orm):
    # Adding index on 'File', fields ['checksum']
    db.create_index('sentry_file', ['checksum'])
Contributor

We can't run this in production.

This needs to be done with CREATE INDEX CONCURRENTLY. grep the repo for examples of other migrations adding indexes.
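
A sketch of the concurrent variant in a South migration (the index name is an assumption; grep the repo for the exact pattern used elsewhere):

def forwards(self, orm):
    # CREATE INDEX CONCURRENTLY cannot run inside a transaction,
    # so end the one South opened before issuing the raw statement
    db.commit_transaction()
    db.execute(
        'CREATE INDEX CONCURRENTLY "sentry_file_checksum" '
        'ON "sentry_file" ("checksum")'
    )
    db.start_transaction()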


# Flag to indicate if this migration is too risky
# to run online and needs to be coordinated for offline
is_dangerous = False
Contributor

Because creating this index will take a while, especially with CONCURRENTLY, this needs to be flipped to True so we don't block deploy for however long it takes to create the index.
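
I.e.:

# the concurrent index build on sentry_file will take a while,
# so run this migration offline rather than during deploy
is_dangerous = True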

@HazAT HazAT changed the title feat: Add assemble endpoint feat: Add dif assemble endpoint Feb 7, 2018
Contributor

@mitsuhiko mitsuhiko left a comment

As far as I can tell this is good to go from my side. Annoyingly we can't push this to staging because of the migration.

@HazAT
Member Author

HazAT commented Feb 7, 2018

Also need @mattrobenolt's seal of approval.

Contributor

@mattrobenolt mattrobenolt left a comment

Blocking for the prefetch_related comment and make sure the migration doesn’t need to be rebased.

def _check_file_blobs(self, organization, checksum, chunks):
    files = File.objects.filter(
        checksum=checksum
    ).select_related('blobs').all()
Contributor

I think you’re looking for prefetch_related here. Otherwise, below you’re doing an O(n) for each file to fetch the blobs.
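
I.e., something like (assuming blobs is the many-to-many relation on File):

# one extra query fetches the blobs for all matched files;
# select_related only handles FK/one-to-one relations
files = File.objects.filter(
    checksum=checksum,
).prefetch_related('blobs')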

name=name,
checksum=checksum,
type='chunked',
headers={'state': ChunkFileState.CREATED}
Contributor

I’m not a fan of using a header here to manage the state. Is this file truly temporary?

Might I suggest a name like __state to better signal that it's not real?
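
I.e., something along the lines of:

headers={'__state': ChunkFileState.CREATED}  # double underscore marks it as internal/synthetic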

@mattrobenolt
Contributor

Migration needs to be rebased over #7191.

@HazAT HazAT merged commit 6088008 into master Feb 8, 2018
@HazAT HazAT deleted the feature/dsym-assemble branch February 8, 2018 19:09
@github-actions github-actions bot locked and limited conversation to collaborators Dec 22, 2020