
Bug 1013413 - Allow updating validator hashes without redeployment, and ... #445

Closed

@kmaglione (Contributor)

This depends on the related validator pull request which allows it to accept these checksums.

Some notes on particular changes:

When I tried to test these changes with DRF enabled, I came across a lot of brokenness. I think I wound up fixing a lot of code that only applies to three-legged auth and isn't actually used, and then fixed a few things that applied to three-legged auth and broke during my testing. I also left in some code for switching between three-legged and two-legged auth that can be removed if that's not something we'll ever be doing.

In any case, since DRF is currently switched off, if those changes significantly complicate matters and might delay the review, I'll move most of them to a separate PR.

I also made some changes in api/views.py, namely APIView.__call__ that were necessary for an earlier implementation of my code, but aren't now. I'd be happy to remove them, if they're objectionable, but I think they're an improvement, so I've left them.

I think the rest of this is pretty straightforward. The main changes are:

  1. App and checksum metadata are now passed to the validator directly as python objects, rather than being written to intermediate files.

  2. Checksum data now has a public page letting developers know what is and isn't supported.

  3. Checksum data can be fetched and updated via the API. I decided to go this route, rather than an admin page, mainly so we can update libraries routinely without requiring much manual process, but also because Marketplace has been going down the route of converting admin pages to API calls. I'm also considering adding a utility to the validator to fetch production checksums from AMO for local use.
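The first change above can be sketched as a before/after. All names here (`validate`, `metadata_path`, `metadata`) are hypothetical stand-ins for the real validator entry point, which this PR and the related amo-validator PR define:

```python
import json
import tempfile


def run_validator_via_file(validate, metadata):
    """Old pattern: serialize metadata to an intermediate JSON file and
    hand the validator a path."""
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
        json.dump(metadata, f)
    return validate(metadata_path=f.name)


def run_validator_direct(validate, metadata):
    """New pattern: pass the Python objects to the validator directly,
    with no intermediate file."""
    return validate(metadata=metadata)
```

The direct version avoids temp-file cleanup and a serialize/deserialize round trip entirely.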

@@ -40,15 +40,25 @@ def process_request(self, request):
if not hasattr(request, 'authed_from'):
request.authed_from = []

three_legged = False

Mind putting this in settings instead?

@mstriemer (Contributor)

Many test failures looking for libraries.json. I'm assuming because this isn't pointing at the right amo-validator version.

@kmaglione (Contributor, Author)

Yeah. I ran the tests against the correct validator version locally, but since we moved it to PyPI I can't update the version in this repo until that gets published.

@mstriemer (Contributor)

I'm not crazy about the API being one endpoint to set all of the data. I think it would be a lot nicer to have a FileHash model or something that has id, framework, version, filename, banned, hash columns or whatever seems useful. Using a model and accepting one row at a time should remove the complexity of the ChecksumsForm class which is doing a lot of manual type checking.

Pulling the data back out of the db in the right format might add a few seconds of processing time but we can deal with that if it is too slow.
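The row shape @mstriemer proposes could look roughly like the following. It is sketched as a plain dataclass purely for illustration; in olympia this would be a Django model with CharField/BooleanField columns, and every field name here is taken from the comment above, not from any actual implementation:

```python
from dataclasses import dataclass


@dataclass
class FileHash:
    """One known library file: hypothetical sketch of the proposed model."""
    id: int
    framework: str   # e.g. "jquery"
    version: str     # e.g. "1.11.0"
    filename: str    # e.g. "jquery-1.11.0.min.js"
    banned: bool     # whether this version is disallowed
    hash: str        # hex digest of the file contents
```

With one row per file, per-row validation falls out of the model's field types instead of manual type checking in a form.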

@mstriemer (Contributor)

Can you update requirements/prod.txt to have -e git+https://github.com/kmaglione/amo-validator@framework-stuff#egg=amo-validator instead of amo-validator==<version> for now so we can see the tests pass?

@kmaglione (Contributor, Author)

I thought about going the full DB model route, and there were a few reasons I chose not to:

  1. I want most of this process to be able to happen with as little user-interaction as possible, and without requiring a front-end. This could still be done with a more complicated API, but I think it would become significantly more complicated.

  2. The automatic library fetch/update code I wrote for the validator needs to have direct access to the current metadata. We could always fetch the Olympia-generated JSON before running it, but I'm not especially happy about the idea.

  3. I want the update process to be able to update local metadata without regard for the server, so that we can test changes, for instance, or add local hashes for special cases. Reliably syncing these changes to Olympia after they've been tested, or using separate code paths for testing, is going to get extremely complicated.

I also designed the metadata format to be fairly flexible. It currently allows several other properties, such as the list of files in an XPI that belong to the framework, to be edited, and I'd like to be able to extend it as necessary without significant updates to Olympia.
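For illustration only, a metadata entry with the properties the discussion mentions (per-version hashes, the list of files belonging to the framework, a banned flag) might be shaped like this; every key name is a guess, not the real schema:

```python
PLACEHOLDER_DIGEST = "0" * 64  # stand-in, not a real checksum

# Hypothetical shape for one library's metadata; keys are guesses based
# on the properties described in this thread, not the actual format.
checksum_metadata = {
    "jquery": {
        "versions": {
            "1.11.0": {
                "hashes": [PLACEHOLDER_DIGEST],
                "framework_files": ["jquery-1.11.0.min.js"],
                "banned": False,
            },
        },
    },
}


def is_known_hash(metadata, library, version, digest):
    """Look up a digest in the nested structure above."""
    entry = metadata.get(library, {}).get("versions", {}).get(version)
    return bool(entry) and digest in entry["hashes"]
```

A nested dict like this serializes to JSON directly, which is the portability argument made below against a full ORM model tree.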

As for the validation code, it could probably be made simpler, and it could arguably be removed. There should only ever be two or three people who even have access to the API, and the data should always be generated by code in the validator repository which can be expected to be reliable.

@kmaglione (Contributor, Author)

Updated amo-validator requirement for tests.

@kmaglione (Contributor, Author)

Fixed flake8 errors, but I disagree with ./apps/api/forms.py:24:53: E226 missing whitespace around arithmetic operator. PEP 8 suggests leaving no whitespace around the higher-precedence operators in expressions with mixed precedence.
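The disagreement is about spacing like the following (a made-up expression, not the line from forms.py):

```python
def hypotenuse_squared(a, b):
    # Mixed-precedence expression: * binds tighter than +. PEP 8 permits
    # the tighter spacing below to show grouping, but flake8's E226 check
    # flags it and wants "a * a + b * b".
    return a*a + b*b
```

Since E226 is off by default in pycodestyle precisely because of this PEP 8 allowance, projects commonly either add the spaces or drop E226 from their flake8 config.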

required=True)

def clean_checksum_json(self):
# Do basic schema validation.
Should that be a docstring?

@kmaglione (Contributor, Author)
Maybe... It's a field cleaning function, so its purpose is fairly clear. I'm not sure there's any benefit to exposing this particular detail in a docstring.

@magopian (Contributor)

I believe this PR is linked/needs mozilla/amo-validator#271 ? However, I don't think it contains code to post to the API in here, does it?

Actually, I'd like to avoid the whole API thing, and instead have a task on olympia that will:
1/ call the lib updater (taken out from the validator, and released separately?) by providing it the current library metadata from the database
2/ get the updated metadata from the lib updater, and update the database with it
This way, the validator is just a service that will take the needed input (the apps and library metadata) and return the validation. No more back and forth via local files or API.
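The two-step task proposed here can be sketched as one function over stand-in callables; none of these names correspond to real olympia or lib-updater APIs:

```python
def update_library_metadata(load_from_db, run_lib_updater, save_to_db):
    """Hedged sketch of the proposed olympia task.

    All three parameters are hypothetical callables: one that reads the
    current library metadata from the database, the separately released
    lib updater, and one that writes the result back.
    """
    current = load_from_db()            # 1/ current metadata from the DB
    updated = run_lib_updater(current)  # hand it to the lib updater
    save_to_db(updated)                 # 2/ persist the updated metadata
    return updated
```

Under this design the validator itself stays stateless: it only ever receives metadata as input and returns a validation result.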

Also, I second @mstriemer: storing the data in a full model is the way to go. We don't want to have to do type checks ourselves, it's error prone. And if we need to extend the model from time to time, we'll use migrations as we do for all the other models we already have. We do not want custom/home-made solutions when there is a good and idiomatic way to do it.

Regarding the unnecessary changes, or the changes that are not linked to this bug, please remove them or move them to other PRs.

I really like the idea of having a public page listing everything we support. This is an awesome idea. Passing the apps and library metadata straight to the validator is also excellent, I want that. Thanks!

@kmaglione (Contributor, Author)

I believe this PR is linked/needs mozilla/amo-validator#271 ? However, I don't think it contains code to post to the API in here, does it?

It needs it for the validator schema changes, yes. The code to post to the API is in a different repo for the moment.

Actually, I'd like to avoid the whole API thing, and instead have a task on olympia that will:
1/ call the lib updater (taken out from the validator, and released separately?) by providing it the current library metadata from the database
2/ get the updated metadata fromt he lib updater, and update the database with it

We need more control over these than a cronjob would allow. Many of the checksums are manually added only after review, and it's important that we be able to change what libraries/versions we support without having to deploy new code to production. This has been a huge problem going back several years, which is why we designed this to allow us this kind of flexibility.

For the moment, I think it would be OK to split the API changes into a separate PR, and rely on a management command to update the data for the initial deployment. I don't want to move the data updates out of direct control of a human, though. Once we've been using the library fetcher for a while, and we're satisfied with it, it might be a good idea to add a server-side cronjob, just as long as manual updates remain possible.

Also, I second @mstriemer: storing the data in a full model is the way to go. We don't want to have to do type checks ourselves, it's error prone. And if we need to extend the model from time to time, we'll use migrations as we do for all the other models we already have. We do not want custom/home-made solutions when there is a good and idiomatic way to do it.

The problem with converting this into models is that this is a deep and complex data structure, which is mainly used by code outside of this codebase, without access to SQL or Django's ORM. Just constructing the models is going to be a lot of work, and serializing them back to JSON, and ensuring that doing so does not cause problems, is going to be even more work, and much more error-prone.

It would also mean that the library fetcher code would always have to run on a server with direct access to the AMO DB, which is something I definitely do not want, since it makes updates a much heavier-weight process, when the entire purpose of this change is to make it lighter-weight. It also makes it much more difficult for people running local copies of the validator to maintain their own checksums DBs, since running updates would now require an initialized database and a full django ORM to make any changes and convert them back to JSON.

I don't think we gain anything significant from building out an entire ORM model tree just to avoid a couple of dozen lines of basic JSON schema validation. I'd be happy to move the validation code to the validator itself, since it is essentially the owner of the data structure.
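For a sense of scale, the "couple of dozen lines of basic JSON schema validation" could look roughly like this; the key names are hypothetical and the real ChecksumsForm checks differ:

```python
def validate_checksums(data):
    """Hand-rolled schema check over a nested metadata dict.

    Sketch only: key names ("versions", "hashes") are assumptions from
    this thread, not the schema the PR actually validates.
    """
    if not isinstance(data, dict):
        return ["top level must be an object"]
    errors = []
    for library, entry in data.items():
        versions = entry.get("versions") if isinstance(entry, dict) else None
        if not isinstance(versions, dict):
            errors.append("%s: 'versions' must be an object" % library)
            continue
        for version, info in versions.items():
            hashes = info.get("hashes") if isinstance(info, dict) else None
            if not (isinstance(hashes, list) and
                    all(isinstance(h, str) for h in hashes)):
                errors.append("%s %s: 'hashes' must be a list of strings"
                              % (library, version))
    return errors
```

This is the trade being argued: a flat function like this against a model tree plus migrations plus a JSON serialization layer.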

@kmaglione (Contributor, Author)

Here's the rough version of the client code for updating hashes via the API: https://github.com/kmaglione/amo-oauth

The last commit is all that differs from Andy's client.

@andymckay

Closing for now, since it can't be merged cleanly. If it's going to be worked on, please re-open.

@andymckay closed this Oct 23, 2015