Nipsa #2300

Merged
merged 34 commits into from Jul 22, 2015

Projects

None yet

4 participants

@seanh
Contributor
seanh commented Jun 8, 2015

No description provided.

@seanh seanh added the WIP label Jun 8, 2015
@seanh seanh self-assigned this Jun 8, 2015
@seanh
Contributor
seanh commented Jun 12, 2015

I'm less chuffed with how views_test.py turned out. models_test.py was really clean and simple, possibly because models.py hardly does anything anymore, but it had no patching and was always testing the inputs, outputs and side-effects not testing implementation.

views_test.py: Loads of detailed patching and mocking of lots of different things.

Of course this has to be some patching and mocking in view code but it feels like too much of it to me.

Session and NipsaUser need to be patched in every test (even tests that don't use the mocks) or the tests will be unnecessarily accessing the db.

The tests are pretty repetitive - largely the same tests repeated four times for the index(), create(), delete() and read() functions.

Leaves me wondering if it could be factored differently to make the tests better. But maybe they're ok.

I think some clarity can be added with a few more test function docstrings and comments.

@seanh
Contributor
seanh commented Jun 12, 2015

Reminder to self: these view callables need tests for missing and invalid params

@tilgovi
Contributor
tilgovi commented Jun 14, 2015

@seanh I'm not picky about imports in tests at all, if you were asking that.

@seanh
Contributor
seanh commented Jun 15, 2015

@nickstenning @tilgovi I think we got into a confusion about what import style we had actually agreed on. Or at least it certainly confused the hell out of me. I thought that in our email thread about it that we had agreed on import package, import module and from package.subpackage import module but then Nick dinged me for import pyramid_basemodel and said we had agreed to import symbols from within modules directly (from pyramid_basemodel import Session etc).

I think we need to know what we agreed on here so we don't keep butting heads about it / I need to know what we're doing so I don't get told off.

Can I suggest we do what we probably should have done in the first place and just follow what the Google Python Style Guide says about it? I just realised that it's in there, and by coincidence (or maybe not so coincidental) it says to do exactly what Randall was arguing for in the email thread:

  • import json (for package/module names with no dots in them)
  • from package import module, from package.subpackage import module, etc the rest of the time
  • Import packages and modules only, not symbols
  • No relative imports
  • One import per line

I think this is also a good compromise between what I originally wanted to do (import everywhere and no from, max context wherever an imported name is used) vs always importing symbols (max conciseness) - it gives some context and is pretty concise.

@nickstenning
Member

Can I suggest we do what we probably should have done in the first place and just follow what the Google Python Style Guide says about it?

That's fine with me. Could you start a styleguide repository in the hypothesis org and put what you want to codify in there? I'd much rather it was explicitly documented somewhere?

@seanh
Contributor
seanh commented Jun 15, 2015

I actually found it here https://h.readthedocs.org/en/latest/hacking/code-style.html though if we want to have an org-wide style guide, maybe we should move that to a style guide repo?

@tilgovi
Contributor
tilgovi commented Jun 15, 2015

Happy to use hypothesis/vision for this sort of thing.

@seanh
Contributor
seanh commented Jun 15, 2015

Reminder to self: these view callables need tests for missing and invalid params

Ooh, actually I don't think it's possible to pass an invalid param :)

@seanh
Contributor
seanh commented Jun 16, 2015

TODO:

  • Rebase on latest master (significant changes)
  • Tests for a few parts of the already-written code
  • Add an annotations API db, instead of using the app's one. The work done in #2320 will be useful
  • Add the concept of admin users to h including a way to create the first admin user, an api for managing admin users and a way for the accounts api to send the info that a user is an admin to the annotations api. This should probably be done in a separate pull request and merged first. See this email thread about it https://groups.google.com/a/list.hypothes.is/forum/#!topic/dev/sUZbfkXSw2o
  • Make it so that only admin users can use the nipsa api
  • Probably some import styles to fix
  • Fix landscape's issues
  • Remove the nipsa flag from the user table in the db (not using this)
@seanh
Contributor
seanh commented Jul 1, 2015

Another email thread with ideas for this feature: https://groups.google.com/a/list.hypothes.is/forum/#!topic/dev/PlCkND_1YdI (we should implement a generic filtering feature in the API, then implement NIPSA entirely in the app)

@seanh
Contributor
seanh commented Jul 13, 2015

Just rebased this onto master

@seanh
Contributor
seanh commented Jul 17, 2015

@tilgovi @nickstenning This is finally ready for review now. Brief summary of how this works:

  • Admins can access an HTML page at /nipsa for nipsa'ing and unnipsa'ing users
  • For ease of use this page accepts usernames like "seanh" instead of ids like "acct:seanh@hypothes.is". What is actually stored in the nipsa table in the db though are the full ids. The id is constructed from the submitted username using request.domain. Note that the user whose id we're constructing is the user to be nipsa'd, not the authorized admin user who is submitting the form. So I'm not sure whether using request.domain is correct here. The alternative would be to require users to type the full "acct:seanh@hypothes.is".
  • The NIPSA API provides logic functions that the HTML page imports and calls. There are no HTTP API endpoints for NIPSA / it isn't exported as part of our public HTTP API. Arguably we should export a JSON API for it - because that would make NIPSA a complete usable feature in the API, without the app. Currently you need the app's HTML page to use it. But we don't need an HTTP API now so I've left it out.
  • When you nipsa or unnipsa a user a worker function is kicked off to add the nipsa flag to or remove the flag from all of the user's past annotations. This uses Es's scan and bulk update which is the most efficient way to do this (recommended by Es docs).
  • When a nipsa'd user creates a new annotation the nipsa flag is added to the annotation at create time.
  • The search API now does filtered Es queries. All nipsa'd annotations are filtered out, but if there's an authorized user then all of their annotations are filtered in so nipsa'd users can see their own annotations but no one else can see them.
  • Filtering is how the Es docs recommend to do this sort of thing - it is very efficient as documents can be filtered out before starting the query (before ranking and scoring and everything). Also it means that Es features such as limits and offsets continue to work against the remaining annotations.
  • So "public site areas" is defined as the search API, meaning nipsa'd annotations won't appear in the stream or sidebar for example.
  • The app/API split is respected by storing the nipsa flags in their own database table belonging to the API, instead of in the app's user table. This means there's no need to communicate the nipsa'd state from app to API: the API simply checks its own model.
  • I've got h/api/nipsa/models.py simply importing h.db in order to use db.Base, but this means that the API is importing code from the app. Note sure how you want to do this.

This implementation is pretty much what I outlined before I started coding (a month and a half ago!). Despite the landscape having changed in the meantime, I still think this implementation has a lot to recommend it over the many alternatives that have been suggested:

  1. It's probably the simplest and most straightforward way to implement nipsa
  2. I think flagging and filtering out not-safe-for-work content is a sensible feature to have in our API, that many sites/web apps/desktop or mobile apps that might use our API might want
  3. The code is clean. In particular, it's almost all in h/api/nipsa/ except for a couple of one-liners elsewhere in the API, and an HTML page in the app. The existing code is hardly touched at all by NIPSA.
  4. It doesn't introduce any new abstract concepts such as a "public" sphere etc which may turn out to be premature. I think since we haven't shipped nipsa and haven't even started to implement groups, we don't know how those features will turn out and evolve in response to users, so we don't know if any given abstraction over both of them is a good one yet. No abstraction is better than a bad abstraction.
  5. The way it respects the app/API split is really simple and obvious.
  6. Doesn't do anything like accessing Es directly from the app (a -1, even if not strictly forbidden) or adding "generic" functions to the API just to enable one feature in the app
@seanh
Contributor
seanh commented Jul 17, 2015

@landscape-bot doesn't seem to have been commenting on any of my PRs lately

@landscape-bot

Code Health
Repository health increased by 0.41% when pulling 5e5969f on nipsa into 8798f4c on master.

@nickstenning nickstenning commented on an outdated diff Jul 17, 2015
h/api/nipsa/logic.py
+
+ If the user isn't NIPSA'd then nothing will happen (but an "unnipsa"
+ message for the user will still be published to the queue).
+
+ """
+ nipsa_user = models.NipsaUser.get_by_id(user_id)
+ if nipsa_user:
+ request.db.delete(nipsa_user)
+
+ _publish(request, json.dumps({"action": "unnipsa", "user_id": user_id}))
+
+
+def is_nipsad(user_id):
+ """Return True if the given user is on the NIPSA list, False if not."""
+ user_nipsa = models.NipsaUser.get_by_id(user_id)
+ if user_nipsa:
@nickstenning
nickstenning Jul 17, 2015 Member

return user_nipsa is not None would be more idiomatic here (IMO).

@nickstenning nickstenning commented on an outdated diff Jul 17, 2015
h/api/nipsa/models.py
@@ -0,0 +1,30 @@
+import sqlalchemy
+from sqlalchemy.orm import exc
+
+from h import db
+
+
+class NipsaUser(db.Base):
@nickstenning
nickstenning Jul 17, 2015 Member

So as I mentioned in the document that we discussed yesterday, the API should probably have its own declarative base class.

I'm happy to overlook this for now, but if you do fancy having a stab at this it should be reasonably straightforward. You'd probably just need to add something like the following to h/api/db.py:

from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

def use_session(session, base=Base):
    base.query = session.query_property()

def bind_engine(engine, base=Base, should_create=False, should_drop=False):
    base.metadata.bind = engine
    if should_drop:
        base.metadata.drop_all(engine)
    if should_create:
        base.metadata.create_all(engine)

And then in h/db.py#includeme at an appropriate point add

api.db.use_session(Session)

And then at the end of h/db.py#bind_engine add

api.db.bind_engine(engine, should_create=should_create, should_drop=should_drop)

And hopefully that will give you a distinct Base class you can use, configured with a query property connected to the shared session.

@nickstenning nickstenning commented on an outdated diff Jul 17, 2015
h/api/nipsa/models.py
+class NipsaUser(db.Base):
+
+ """A NIPSA entry for a user (SQLAlchemy ORM class)."""
+
+ __tablename__ = 'nipsa'
+ user_id = sqlalchemy.Column(sqlalchemy.UnicodeText, primary_key=True,
+ index=True)
+
+ def __init__(self, user_id):
+ self.user_id = user_id
+
+ @classmethod
+ def get_by_id(cls, user_id):
+ """Return the NipsaUser object for the given user_id, or None."""
+ try:
+ return db.Session.query(cls).filter(
@nickstenning
nickstenning Jul 17, 2015 Member

There's a query property on the base class so db.Session.query(cls) can be written more simply as cls.query.

@nickstenning nickstenning commented on an outdated diff Jul 17, 2015
h/api/search.py
@@ -31,7 +35,20 @@ def _match_clause_for_uri(uristr):
}
-def build_query(request_params):
+def _es_client():
+ """Return an elasticsearch.Elasticsearch client object."""
+ return elasticsearch.Elasticsearch([{"host": "localhost", "port": 9200}])
@nickstenning nickstenning commented on an outdated diff Jul 17, 2015
h/migrations/versions/534754b1bf54_add_nipsa_table.py
+Create Date: 2015-07-17 12:51:53.582010
+
+"""
+
+# revision identifiers, used by Alembic.
+revision = '534754b1bf54'
+down_revision = '49764e8a7e69'
+
+from alembic import op
+import sqlalchemy as sa
+
+
+def upgrade():
+ op.create_table(
+ 'nipsa',
+ sa.Column('user_id', sa.UnicodeText, primary_key=True, index=True)
@nickstenning
nickstenning Jul 17, 2015 Member

We're consistent in the rest of the code base about this being called userid, without an underscore. It should be the same here (and in the corresponding model).

@seanh
Contributor
seanh commented Jul 18, 2015

Note to self: We may need to re-enable nipsaing of users who don't exist (currently prevented by the html view) so we can hide annotations of user accounts that have been deleted.

@seanh
Contributor
seanh commented Jul 20, 2015

Ah, I'm getting an Es traceback when loading the sidebar when not logged in...

@seanh
Contributor
seanh commented Jul 20, 2015

Still need to fix this bug when you're not logged in...

@seanh
Contributor
seanh commented Jul 20, 2015

Ah, I'm getting an Es traceback when loading the sidebar when not logged in...

Think this was a red herring, it's just an Es error that's already on master #2257

@seanh
Contributor
seanh commented Jul 20, 2015
@seanh
Contributor
seanh commented Jul 20, 2015

@nickstenning Ok, that's everything done.

I ended up adding request.es_client which we could make use of in other places as well.

I also removed checking whether the user exists before nipsaing them, b/c I think the code is better without it (see commit message).

@seanh
Contributor
seanh commented Jul 20, 2015

Oh wait some tests are broken now..

@seanh
Contributor
seanh commented Jul 20, 2015

That should fix it..

@seanh
Contributor
seanh commented Jul 20, 2015

@nickstenning OK now it's ready

@seanh seanh removed the WIP label Jul 20, 2015
@seanh seanh removed their assignment Jul 20, 2015
@tilgovi tilgovi commented on the diff Jul 20, 2015
h/api/nipsa/logic.py
+ :rtype: list of unicode strings
+
+ """
+ return [nipsa_user.userid for nipsa_user in models.NipsaUser.all()]
+
+
+def nipsa(request, userid):
+ """NIPSA a user.
+
+ Add the given user's ID to the list of NIPSA'd user IDs.
+ If the user is already NIPSA'd then nothing will happen (but a "nipsa"
+ message for the user will still be published to the queue).
+
+ """
+ nipsa_user = models.NipsaUser.get_by_id(userid)
+ if not nipsa_user:
@tilgovi
tilgovi Jul 20, 2015 Contributor

I would prefer explicit is None here.

@seanh
seanh Jul 20, 2015 Contributor

https://google-styleguide.googlecode.com/svn/trunk/pyguide.html?showone=True/False_evaluations#True/False_evaluations

The reason for using if nipsa_user is None: seems to be in case nipsa_user is some falsey value other than None, for example in case of a method that may return either None or an empty list, but in this case I know that NipsaUser objects aren't falsey and don't expect get_by_id() to return any falsey values except None so I think Use the "implicit" false if at all possible applies.

@tilgovi
tilgovi Jul 20, 2015 Contributor

I think of the models as being dict-like and while I guess it's not going to be an empty dict because it has keys it still makes it clearer for me to think about explicit non-existence if I see is None.

@tilgovi
tilgovi Jul 20, 2015 Contributor

But I won't start a fight over it.

@nickstenning
nickstenning Jul 22, 2015 Member

I'm also not going to block merging this over an issue as small as this, but as previously expressed I'd much rather see explicit is None checks where that's what's being checked.

@nickstenning
nickstenning Jul 22, 2015 Member

The distinction here (in my head, at least) is between:

  1. Checking for falsiness. This should always be if foo:, not if foo == True: or if foo is True:
  2. Checking for the sentinel value None. This should always be if foo is None:/if foo is not None:

This isn't a check for falsiness, it's a check for a sentinel value.

@tilgovi tilgovi and 1 other commented on an outdated diff Jul 20, 2015
h/api/nipsa/logic.py
+ If the user is already NIPSA'd then nothing will happen (but a "nipsa"
+ message for the user will still be published to the queue).
+
+ """
+ nipsa_user = models.NipsaUser.get_by_id(userid)
+ if not nipsa_user:
+ nipsa_user = models.NipsaUser(userid)
+ request.db.add(nipsa_user)
+
+ _publish(request, json.dumps({"action": "nipsa", "userid": userid}))
+
+
+def unnipsa(request, userid):
+ """Un-NIPSA a user.
+
+ If the user isn't NIPSA'd then nothing will happen (but an "unnipsa"
@tilgovi
tilgovi Jul 20, 2015 Contributor

Why publish the event anyway?

@seanh
seanh Jul 20, 2015 Contributor
  1. It's simpler to just always publish, doesn't do any harm and there's no need to increase the complexity of the logic and add more tests
  2. This way enables re-unnipsa'ing users. For example if I unnipsa a user but the worker process crashes and doesn't get all of their annotations, I just unnipsa them again to try again. This is now a bit of a mute point since the actual HTML UI for nipsa will make you re-nipsa them in order to unnipsa them again anyway! But originally I had a JSON API (and we may want one again one day) and you could re-nipsa and re-unnipsa directly.
@tilgovi
tilgovi Jul 20, 2015 Contributor

Hmmm. If the worker crashes, the message won't be acknowledged, and it will still be in the queue, but I guess I see your point about simplicity of the logic.

@seanh
seanh Jul 20, 2015 Contributor

Well, I guess the point is that a) re-nipsa'ing people doesn't do any harm and b) maybe one day there'll be a reason why we want to re-nipsa someone so why write unnecessary logic that prevents re-nipsa'ing?

This is the same reason why I didn't add logic to prevent you from nipsaing a user id that doesn't exist in the users table - because I can imagine (possibly unlikely) future scenarios where we may want to nipsa users who aren't in the users table, and adding logic to prevent it is just unnecessary

@tilgovi tilgovi commented on an outdated diff Jul 20, 2015
h/api/nipsa/logic.py
+
+def unnipsa(request, userid):
+ """Un-NIPSA a user.
+
+ If the user isn't NIPSA'd then nothing will happen (but an "unnipsa"
+ message for the user will still be published to the queue).
+
+ """
+ nipsa_user = models.NipsaUser.get_by_id(userid)
+ if nipsa_user:
+ request.db.delete(nipsa_user)
+
+ _publish(request, json.dumps({"action": "unnipsa", "userid": userid}))
+
+
+def is_nipsad(userid):
@tilgovi
tilgovi Jul 20, 2015 Contributor

Maybe has_nipsa since NIPSA is more of a flag than a verb.
I would add_nipsa and remove_nipsa instead of nipsa and unnipsa, but that's personal preference, maybe.

@tilgovi tilgovi and 1 other commented on an outdated diff Jul 20, 2015
h/api/nipsa/search.py
@@ -0,0 +1,83 @@
+import copy
+
+
+def nipsa_filter(query, userid=None):
@tilgovi
tilgovi Jul 20, 2015 Contributor

The name of this function is nipsa_filter and it doesn't really change the query, it just nests it inside the filtered query. What if instead this function just returned the filter and its caller could combine the query and the filter to make a filtered query?

@seanh
seanh Jul 20, 2015 Contributor

That sounds like the kind of thing we may need in the future when we have a function for nipsa filtering, a function for group filtering, and a function for combining them.

But for now all logic about Es filters (including how to nest them in a query with a "filtered" dict containing "filter" and "query" keys) is encapsulated in this one function, which is nice. I believe there's no code anywhere else that knows anything about Es filters.

Also I was trying really hard to keep the lines of code that I added outside of h/api/nipsa/ as few as possible.

@tilgovi
tilgovi Jul 20, 2015 Contributor

It would make the rest of the module simpler to read, in my opinion, and it would save you a couple tests like asserting that this doesn't modify the query (that it never needed in the first place).

@seanh
seanh Jul 20, 2015 Contributor

I really think that spreading our knowledge and logic about what the structure of Es queries is and how to construct them in different modules throughout the app is a pretty big negative..

@tilgovi
tilgovi Jul 20, 2015 Contributor

Nonsense. The only places that call this are streamer and search. The streamer code already knows far far too much about Elastisearch. That would only ever change, maybe, if we had it use some totally abstracted search call, which maybe api already exposes well enough. And search is already constructed the query object anyway in order to pass that in.

Basically, if the caller has to know about the structure of an ES query in order to pass it to this function I find the claim that having the caller instead receive the filter and have to construct an ES query from it is any more exposing details about ES totally bogus.

@seanh
seanh Jul 20, 2015 Contributor

It's filters that are encapsulated here, api/search.py knows about Es queries of course but it doesn't know anything about filters yet, including how to insert them into query dicts, all that knowledge is in this one function in api/nipsa/search.py.

Filtering was something I was only introducing into the code for nipsa so for the sake of not adding new "generic" functions into other parts of the code that are only used for the nipsa function, and to minimize nipsa's footprint outside of h/api/nipsa/ (that footprint being just a couple of one-liners was one of my favourite things about this approach to nipsa), I encapsulated it here.

I imagine once we have nipsa and groups, two features that are probably both gonna use filters, then we'll need some generic inserting filters in queries code in api/search.py, though. I would wait until we have two users but I can do it now if you really want

@tilgovi
tilgovi Jul 20, 2015 Contributor

Yes. I would really prefer that.

@tilgovi
tilgovi Jul 20, 2015 Contributor

It's about making this function simple. There's no reason I can see why it needs to accept an argument that it has a contractual obligation not to modify, and therefore has to deep copy, when it never inspects that argument or modifies it anyway, just to prevent the caller from having to know about how to construct filtered queries even though it already knows its using elasticsearch and constructing elasticsearch queries.

The result is only going to be a proliferation of code that modifies elasticsearch query objects, like this code, rather than code that constructs the useful pieces of them to be assembled as needed by the caller.

@tilgovi tilgovi and 1 other commented on an outdated diff Jul 20, 2015
h/api/nipsa/search.py
+
+ if userid:
+ # Always show the logged-in user's annotations even if they have nipsa.
+ should_clauses.append({"term": {"user": userid}})
+
+ query["query"] = {
+ "filtered": {
+ "filter": {"bool": {"should": should_clauses}},
+ "query": query["query"]
+ }
+ }
+
+ return query
+
+
+def query_for_users_annotations(userid):
@tilgovi
tilgovi Jul 20, 2015 Contributor

Similarly, this could just return {"term", {"userid": userid}} and then it could be used by nipsa_filter to construct and return a boolean should filter.

@seanh
seanh Jul 20, 2015 Contributor

But again I don't want worker.py to do any more work than it already does, or to know anything about the structure of Es dicts. Encapsulation. But I could add a _user_filter() helper function to return {"term": {"user": userid}} that both query_for_users_annotations() and nipsa_filter() could call, save a little bit of duplication at the cost of readability.

@tilgovi tilgovi commented on an outdated diff Jul 20, 2015
h/api/nipsa/worker.py
+ "doc": {"not_in_public_site_areas": True}
+ }
+
+
+def remove_nipsa_action(annotation):
+ """Return an Elasticsearch action to remove NIPSA from the annotation."""
+ return {
+ "_op_type": "update",
+ "_index": annotator.es.index,
+ "_type": "annotation",
+ "_id": annotation["_id"],
+ "script": "ctx._source.remove(\"not_in_public_site_areas\")"
+ }
+
+
+def add_or_remove_nipsa(userid, action, es_client):
@tilgovi
tilgovi Jul 20, 2015 Contributor

The fact that we use add and remove here makes me think we definitely should change nipsa and unnipsa in the logic module.

@tilgovi tilgovi and 1 other commented on an outdated diff Jul 20, 2015
h/api/nipsa/worker.py
+ return {
+ "_op_type": "update",
+ "_index": annotator.es.index,
+ "_type": "annotation",
+ "_id": annotation["_id"],
+ "script": "ctx._source.remove(\"not_in_public_site_areas\")"
+ }
+
+
+def add_or_remove_nipsa(userid, action, es_client):
+ """Add/remove the NIPSA flag to/from all of the user's annotations."""
+ assert action in ("nipsa", "unnipsa")
+
+ if action == "nipsa":
+ query = nipsa_search.not_nipsad_annotations(userid)
+ else:
@tilgovi
tilgovi Jul 20, 2015 Contributor

Maybe we should be defensive and say explicitly if action == "unnipsa"?

@seanh
seanh Jul 20, 2015 Contributor

That's what the assert above is for

@tilgovi
tilgovi Jul 20, 2015 Contributor

Ah, okay. Totally didn't catch that.

@tilgovi tilgovi commented on an outdated diff Jul 20, 2015
h/api/nipsa/worker.py
+ }
+
+
+def add_or_remove_nipsa(userid, action, es_client):
+ """Add/remove the NIPSA flag to/from all of the user's annotations."""
+ assert action in ("nipsa", "unnipsa")
+
+ if action == "nipsa":
+ query = nipsa_search.not_nipsad_annotations(userid)
+ else:
+ query = nipsa_search.nipsad_annotations(userid)
+
+ annotations = search.scan(es_client=es_client, query=query, fields=[])
+
+ if action == "nipsa":
+ actions = [add_nipsa_action(a) for a in annotations]
@tilgovi
tilgovi Jul 20, 2015 Contributor

Could set a local variable above to the value add_nipsa_action or remove_nipsa_action and save the duplicated conditional. Might read more cleraly.

@tilgovi tilgovi and 1 other commented on an outdated diff Jul 20, 2015
h/api/search.py
@@ -31,7 +34,41 @@ def _match_clause_for_uri(uristr):
}
-def build_query(request_params):
+def scan(es_client, query, fields):
@tilgovi
tilgovi Jul 20, 2015 Contributor

This doesn't seem to add anything at all to helpers.scan. You could just export helpers.scan here if you wanted to. Or maybe you're deliberately trying to hide the extra functionality of helpers.scan? Any reason?

@seanh
seanh Jul 20, 2015 Contributor

I'm trying to encapsulate elasticsearch in this module. So this module calls elasticsearch, other modules call this module. I know we said in a call recently that encapsulating Es in one place was desirable but not if there was a good reason for some other code to access it directly. When I wrote this code though our rule was simpler: encapsulate it here.

So I guess the options would be:

  • worker.py imports elasticsearch directly and calls scan and bulk itself, but I consider accessing it directly a -1 and not sure there's a compelling enough reason to do it
  • from elasticsearch import bulk in search.py instead of a wrapper function, but then you're gonna get unused import warnings from prospector
@tilgovi
tilgovi Jul 20, 2015 Contributor

If we didn't have to pass the client in I would understand the encapsulation argument. We would actually be hiding the details of the implementation.

But when our arguments include elasticsearch client instances and elasticsearch-formatted query objects I think the encapsulation is bogus anyway and it's better not to pretend.

@seanh
seanh Jul 20, 2015 Contributor

Hmm, well the user just has to pass request.es_client, doesn't have to actually know anything about how Es works. But passing that through everything is boilerplatey. This makes me think that maybe the functions in search.py should be combined into an EsClient class, then request.es_client could become an instance of this class, then the code that wants to use it just does request.es_client.scan(query, fields). (This is how `request.api_client works too). What do you think?

@seanh
seanh Jul 20, 2015 Contributor

So

search.scan(es_client=es_client, query=query, fields=[])

becomes

request.es_client.scan(query=query, fields=[])

or in fact just

es_client.scan(query=query, fields=[])

since add_or_remove_nipsa() would just be passed the EsClient instance not the whole request.

@tilgovi
tilgovi Jul 20, 2015 Contributor

I would really just skip the abstraction entirely and we'll figure out what makes sense later.

@tilgovi
tilgovi Jul 20, 2015 Contributor

One does have to know how ES works in order to use these. One has to first know about scanning and bulk operations, and about the query object format.

@tilgovi tilgovi commented on an outdated diff Jul 20, 2015
h/api/search.py
+
+ :param es_client: The Elasticsearch client object to use
+ :type es_client: elasticsearch.Elasticsearch
+
+ :param query: The Elasticsearch query to search for,
+ in Elasticsearch query DSL format
+ :type query: dict
+
+ :param fields: The list of fields to return for each hit
+ :type fields: list
+
+ """
+ return helpers.scan(es_client, query=query, fields=fields)
+
+
+def bulk(es_client, actions):
@tilgovi
tilgovi Jul 20, 2015 Contributor

Same with this. Is there a need for us to just re-export stripped down versions of elasticsearch functions?

@tilgovi tilgovi commented on an outdated diff Jul 20, 2015
h/api/search.py
@@ -41,6 +78,11 @@ def build_query(request_params):
h search API
:type request_params: webob.multidict.NestedMultiDict
+ :param userid: the ID of the authorized user (optional, default: None),
+ if a userid is given then this user's annotations will never be
@tilgovi
tilgovi Jul 20, 2015 Contributor

This bit seems like too much implementation detail. The argument is the authorized userid, whether/how that factors into a feature like NIPSA seems not related to this interface to me.

@tilgovi tilgovi and 1 other commented on an outdated diff Jul 20, 2015
h/api/views.py
@@ -222,6 +224,9 @@ def _create_annotation(fields, user):
annotation['user'] = user.id
annotation['consumer'] = user.consumer.key
+ if nipsa.is_nipsad(user.id):
+ annotation["not_in_public_site_areas"] = True
@tilgovi
tilgovi Jul 20, 2015 Contributor

any reason not to just make this read "nipsa"? I think Elasticsearch may compress the keys but when we refer to this as nipsa absolutely everywhere else I don't see why we should spell it out here.

@seanh
seanh Jul 20, 2015 Contributor

This was just to give users looking at the documents from our api a clue what it means (though I now recall that we discussed hiding it from the output..)

@tilgovi
tilgovi Jul 20, 2015 Contributor

Up to you. I find it unwieldy and think we could solve it with documentation.

@seanh
Contributor
seanh commented Jul 20, 2015

@tilgovi Ok that's all the changes done. I have to run now, but that was a lot of changes (including a lot of big renames and stuff) so I want to manually test that everything's working and look over it again tomorrow before we merge it

@tilgovi
Contributor
tilgovi commented Jul 20, 2015

I'm going to rebase it, check it out, and see if I can't send it straight in. Thanks for all the effort and patience.

@tilgovi
Contributor
tilgovi commented Jul 21, 2015

I rebased this and I also changed the worker so that removal uses the index operation rather than update because it as easier to figure out how/where to turn on dynamic scripting support / document that / figure out if that works for hosting ES instances / store a script / blah blah blah. I could have gone with nipsa: false instead, as an update, but I'm just not that concerned about performance here.

@tilgovi
Contributor
tilgovi commented Jul 21, 2015

Last remaining blocker is that the should_send_event function in h.streamer has no idea how to handle NIPSA and so annotations from flagged users will show up in the stream if they are created/updated while a user is watching the stream or if they are revealed by infinite scroll. That's pretty poor NIPSA :(.

Unfortunately, that code is kind of a mess and I don't know what to do with it. It's going to be tricky to figure out when to let NIPSA hide real-time updates if we keep doing things as we have been. Obviously on the stream, yes, because that's public, but what about when on the standalone annotation page for a NIPSA'd annotation? I think clearly we should get that update. But right now the difference between the two cases is buried, from the server's persective, in the crazy payload of pseudo-Elasticsearch the client passes it over the WebSocket.

๐Ÿ˜ก

@seanh
Contributor
seanh commented Jul 21, 2015

What?? I was pretty sure annotations were being filtered out of the stream and infinite scroll but I'll double check. I didn't think about annotations that get created or updated while someone is looking at the page though

@seanh
Contributor
seanh commented Jul 21, 2015

Ok, I think the existing streamer.py solution does work for infinite scrolling:

  1. Create user fred, make some public annotations. Verify that both fred and non-logged-in visitors can see fred's annotations in the stream and sidebar.
  2. Create user seanh, make him an admin using the command line
  3. Use seanh to nipsa fred (make sure you have the nipsa worker process running)
  4. Verify that fred can still see his own annotations in the stream and sidebar, but neither seanh nor not-logged-in users can see them
  5. Use seanh to create lots of annotations that will appear before fred's in the stream, enough to push fred's annotations off the first page load and into the infinite scroll
  6. Verify that fred receives his annotation from infinite scroll, but seanh and not-logged-in users don't
@seanh
Contributor
seanh commented Jul 21, 2015

I discovered this issue:

  1. Log in as Fred (who is nipsa'd) and you can see Fred's annotations in the sidebar
  2. Log out, Fred's annotations are still visible
  3. Log in as Bob, Fred's annotations are still visible
  4. Reload the page and the nipsa'd annotations disappear

This doesn't happen with private annotations. What I did for nipsa was find the part of the code in streamer.py when it queries elasticsearch for the annotations, and insert the nipsa filter in there. So this seems to suggest that on logout/login it doesn't simply throw away all the annotations and request them again with the new authorization, but instead re-filters the already-fetched annotations somewhere.

I'm gonna vote for not bothering to fix this for nipsa.

@seanh
Contributor
seanh commented Jul 21, 2015

Ok, I figured out that I have to turn on the queue and streamer features in the db now, and then I can reproduce the problem that Randall mentioned: New (and possibly also updated) nipsa'd annotations will appear on the page via the websocket. This one should be fixed I think

@seanh
Contributor
seanh commented Jul 21, 2015

Ok that's the issue with new/updated annotations coming down the websocket fixed

@seanh
Contributor
seanh commented Jul 21, 2015

Standalone annotation pages I need to test but they seem to be broken for me on master right now, investigating..

@seanh
Contributor
seanh commented Jul 21, 2015

Ok, so live updates of annotations (nipsa'd or not) on individual annotation pages does seem to be working if the user looking at the page is the annotation creator himself (the only code I wrote to filter annotations out of the websocket only filters them out if they're nipsa'd and not created by the authorized user). It probably will not work for other users looking at the page, but I'm not sure how big a deal that is?

But there's a bigger problem - individual annotation pages are broken if the annotation is nipsa'd, and the authorized user is not the annotation creator. This is because the page loads a template with no annotation and then Angular makes an XHR to the search API searching for an annotation by ID, and of course nipsa'd annotations are filtered out.

I think the correct solution here is for the frontend to use the API endpoint for retrieving a single annotation, since with the current situation there's no (nice) way for the backend to detect these requests and treat them differently from other search requests.

How important is it to fix this, given the uselessness of individual annotation pages?

@tilgovi
Contributor
tilgovi commented Jul 21, 2015

Sounds right. We definitely want to make a single annotation request.

On Tue, Jul 21, 2015, 07:03 Sean Hammond notifications@github.com wrote:

Ok, so live updates of annotations (nipsa'd or not) on individual
annotation pages does seem to be working if the user looking at the page is
the annotation creator himself (the only code I wrote to filter annotations
out of the websocket only filters them out if they're nipsa'd and not
created by the authorized user).

But there's a bigger problem - individual annotation pages are broken if
the annotation is nipsa'd, and the authorized user is not the annotation
creator. This is because the page loads a template with no annotation and
then Angular makes an XHR to the search API searching for an annotation
by ID, and of course nipsa'd annotations are filtered out.

I think the correct solution here is for the frontend to use the API
endpoint for retrieving a single annotation, since with the current
situation there's no (nice) way for the backend to detect these requests
and treat them differently from other search requests.

โ€”
Reply to this email directly or view it on GitHub
#2300 (comment).

@tilgovi
Contributor
tilgovi commented Jul 21, 2015

Ugh, but I bet the point of that was to get the replies somehow...

Which leads to the question: if I'm viewing a NIPSA annotation, should I
see NIPSA replies?

Or, if I'm viewing a non-NIPSA annotation, should I see NIPSA replies?
Probably not... but how do we tell the difference between these?

On Tue, Jul 21, 2015, 07:14 Randall Leeds tilgovi@hypothes.is wrote:

Sounds right. We definitely want to make a single annotation request.

On Tue, Jul 21, 2015, 07:03 Sean Hammond notifications@github.com wrote:

Ok, so live updates of annotations (nipsa'd or not) on individual
annotation pages does seem to be working if the user looking at the page is
the annotation creator himself (the only code I wrote to filter annotations
out of the websocket only filters them out if they're nipsa'd and not
created by the authorized user).

But there's a bigger problem - individual annotation pages are broken if
the annotation is nipsa'd, and the authorized user is not the annotation
creator. This is because the page loads a template with no annotation and
then Angular makes an XHR to the search API searching for an
annotation by ID, and of course nipsa'd annotations are filtered out.

I think the correct solution here is for the frontend to use the API
endpoint for retrieving a single annotation, since with the current
situation there's no (nice) way for the backend to detect these requests
and treat them differently from other search requests.

โ€”
Reply to this email directly or view it on GitHub
#2300 (comment).

@seanh
Contributor
seanh commented Jul 21, 2015

Argh, I hadn't even considered replies.

There is a way the backend can detect these individual annotation requests and treat them differently, it can look at the Referer header for example. It's ugly, but...

@seanh
Contributor
seanh commented Jul 21, 2015

I think for replies, replies to nipsa'd annotations should be filtered out whenever the annotation is filtered out, but that's something I'll need to test for now and maybe fix

@seanh
Contributor
seanh commented Jul 21, 2015

Actually replies are currently shown under the annotation in both the stream and sidebar, so it's already the case that if an annotation isn't shown none of its replies are shown either. Is there anything to do here?

@seanh
Contributor
seanh commented Jul 21, 2015

Ok just wanna go through this systematically. The current implementation:

  1. Nipsa'd user replies to their own nipsa'd annotation:
    • The user can see their own annotation and reply
    • Other users will see neither.
  2. Nipsa'd user 1 replies to nipsa'd user 2's annotation (this requires user 2 to have been nipsa'd after the reply was made):
    • User 2 sees their annotation but not the reply.
    • User 1 sees neither, it seems kind of bad that they can't see their own reply, but then how would we even display it?
    • Other users see neither
  3. Nipsa'd user replies to non-nipsa'd annotation:
    • The nipsa'd user sees both annotation and reply
    • Others (including annotation author) see just the annotation, not the reply
  4. Non-nipsa'd user replies to nipsa'd annotation (requires the user to be nipsa'd after the reply):
    • The nipsa'd user sees both annotation and reply
    • Other users (including the author of the reply) see neither the annotation nor the reply. Weird that they can't see their own reply but not sure what we could do about it

I did a quick test of all of these and the above seems to be true on initial page loads and on web socket live updates. I did not test it all with infinite scroll as well but I think it'll be the same - it's the same filter rule.

@seanh
Contributor
seanh commented Jul 21, 2015

I'd say the right behaviour for an individual annotation page is to always show the annotation and all its replies, regardless of whether the annotation author or reply authors are nipsa'd and regardless of who the authorized user is. So in other words we just need to completely disable the nipsa filtering on those pages, it's currently enabled..

Sean Hammond and others added some commits Jun 8, 2015
Sean Hammond Add NIPSA (Not In Public Site Areas) feature
This feature allows admin users to flag users as "nipsa'd", meaning that
these users' annotations will not appear in "public site areas".
Currently "public site areas" are defined as basically everywhere:
nipsa'd annotations are filtered out from the results returned by the
search API, so they won't appear in the stream or in the sidebar for
example.

- h/api/nipsa/ contains a new nipsa package with:
* nipsa table in db that records ids of nipsa'd users
* search functions for inserting nipsa-filtering commands into Es query
  dicts
* logic functions for using the nipsa feature, e.g. for nipsa'ing or
  unnipsa'ing a user. or asking whether a user is nipsa'd
* a worker function that adds the nipsa flag to or removes the nipsa
  flag from all of a user's past annotations

h/api/search.py: Now runs Es queries through the nipsa-filtering function

h/api/views.py: When creating a new annotation, if the user is nipsa'd
it now adds the nipsa flag to the annotation.

h/nipsa.py: A simple HTML page for admin users to nipsa and unnipsa
users.

Note on app/API separation: We don't want the app to read or write
database tables belonging to the API or vice-versa, nor do we want the
API to import modules defined in the app. The obvious place to put the
nipsa flag would be in a column on the user table (which belongs to the
app), but then when the user creates a new annotation the app would have
to communicate somehow to the API that the authorized user is nipsa'd
without the API reading from the user table or calling a function
defined in the app. Therefore the nipsa flag has been stored in a
separate nipsa table owned by the API instead.
7b80248
Sean Hammond Make a line more idiomatic 4bf2b51
Sean Hammond Use the .query property on h.db.Base e27057b
@tilgovi tilgovi Remove unused import 9203d39
@tilgovi tilgovi not_in_public_site_areas -> nipsa 27aebc9
Sean Hammond Don't hardcode the Elasticsearch URL
Add a request.es_client property instead
7fb6692
Sean Hammond Change user_id -> userid everywhere 5a42ee1
Sean Hammond Enable nipsa'ing users who don't exist
I don't think it's a good idea to restrict nipsa to users who we have a
row for in the users table. The model (nipsa'd user IDs in a separate
table with no relation to the user table) allows us to be more flexible
and nipsa any userid. Taking advantage of that avoids potential future
problems such as:

- Nipsa'ing users whose accounts have been deleted (if in the future we
  enable deleting accounts and it removes the row from the users table)
- Nipsa'ing users from third-party account providers (if in the future
  we enable external accounts and it's implemented in such a way that
  the users have no row in our users table)

So nipsa is just a simple list of userids. Each id may or may not exist
in our users table as well, it doesn't matter. And any annotations
created by these ids are hidden.
209345c
Sean Hammond Give the API its own SQLAlchemy base class 6375b07
Sean Hammond PEP8 c81fb57
Sean Hammond Add a couple of docstrings 685cbc6
Sean Hammond Fix API model tests 8e30f5e
Sean Hammond Change is_nipsad -> has_nipsa 6cf7c94
Sean Hammond Refactor nipsa_filter()
Just return the filter part of the Es query, don't handle inserting it
into the query.
a6d9e85
Sean Hammond Rename nipsa -> add_nipsa, unnipsa -> remove_nipsa everywhere b6e3364
Sean Hammond Refactor add_or_remove_nipsa() 8a344d3
@tilgovi tilgovi Expose NispaUser in h.models 0bfd969
Sean Hammond Remove scan() and bulk() wrappers 37f4003
Sean Hammond Docstring tweak 9e9d803
Sean Hammond Fix broken route e7b4d7f
@tilgovi tilgovi Fix nipsa templates to use .jinja2 extension 88e4d6e
@tilgovi tilgovi NIPSA without dynamic scripting support 311afd8
Sean Hammond Filter new nipsa'd annotations from the WebSocket
When a nipsa'd user creates or updates an annotation this creates an NSQ
event that goes through streamer.py and is sent via a WebSocket to any
users viewing the same page. Those users will see the annotation appear
in their browser without refreshing the page.

Don't send nipsa'd annotations down the WebSocket in this way, except to
clients where the logged-in user is the nipsa'd user who created the
annotation (nipsa'd users can see their own annotations).
db4742a
Sean Hammond Rename BASE -> Base 7f573d5
Sean Hammond Rename BASE -> Base dc3bae5
Sean Hammond Disable NIPSA-filtering on annotation permalink pages
When on individual annotation pages like /a/<annotation_id> make an API
request to /api/annotations/<annotaion_id> instead of to the search API.
This bypasses the search API's NIPSA-filtering.

If you have a direct link to an annotations individual page, then you
can see the annotation (and all its replies) regardless of whether
anything is NIPSA'd or any user is logged-in.
e9accc5
@nickstenning nickstenning and 1 other commented on an outdated diff Jul 21, 2015
h/accounts/__init__.py
"""Make the given user an admin."""
- user = models.User.get_by_username(user_id)
+ user = models.User.get_by_username(userid)
@nickstenning
nickstenning Jul 21, 2015 Member

"acct:foo@bar.com" is a userid, "foo" is a username. get_by_username accepts a username. This param name should probably change.

@tilgovi
tilgovi Jul 21, 2015 Contributor

Or use get_by_id.

Sean Hammond added some commits Jul 21, 2015
Sean Hammond Fix annotation-viewer-controller-test.coffee
The single it() in this test module was not being run (it was indented
inside a beforeEach()) and dedenting it reveals that the beforeEach() is
actually crashing anyway (controller somehow not registering with the
$controller service, looks like).

Rewrite the module to work and be clearer.
b628e83
Sean Hammond Add an annotation-viewer-controller.coffee test
8a788fa
Sean Hammond Rename userid -> username
7594379
@seanh
Contributor
seanh commented Jul 21, 2015

Ok that's the individual annotation pages fixed to use the annotation read API instead of search

@seanh
Contributor
seanh commented Jul 21, 2015

Ok, ready again. To summarise the last couple of days:

  • Responded to Nick's review comments
  • Responded to Randall's review comments
  • There's now a request.es_client with the Elasticsearch URL from the config file
  • The API has its own sqlalchemy base class
  • You can nipsa a userid that doesn't exist in the user table
  • "Live update" annotations are filtered for nipsa
  • Checked that "infinite scroll" annotations were already being filtered
  • Checked behaviour of nipsa with replies
  • Annotation permalink pages now use annotation read URL not search, don't get nipsa'd
  • annotation-viewer-controller-test.coffee has been rewritten, is working now

Maybe I'll realise in the morning something I missed or did wrong

@seanh
Contributor
seanh commented Jul 22, 2015

@nickstenning @tilgovi This is ready for review again, whichever one of you wants it. I'd just ask that you do a quick manual test of everything, just because there were quite a lot of last-second changes (admins can nipsa and unnipsa users; the worker process runs and old annotations appear and disappear from the stream; new annotations from nipsad users are hidden; nipsad users can see their own annotations; everyone can see a nipsad annotation if they go to its permalink page; nipsad annotations are hidden from infinite scroll and live update...)

@nickstenning nickstenning commented on an outdated diff Jul 22, 2015
h/accounts/views.py
@@ -605,15 +605,15 @@ def index():
def create(self):
"""Make a given user an admin."""
try:
- user_id = self.request.params['add']
+ userid = self.request.params['add']
@nickstenning
nickstenning Jul 22, 2015 Member

Same comment here about userid vs username.

@nickstenning nickstenning and 2 others commented on an outdated diff Jul 22, 2015
h/api/nipsa/search.py
+ "query": {...}
+ }
+ }
+ }
+
+ :param userid: The ID of a user whose annotations should not be filtered.
+ The returned filtered query won't filter out this user's annotations,
+ even if the annotations have the NIPSA flag.
+ :type userid: unicode
+
+ """
+ # If any one of these "should" clauses is true then the annotation will
+ # get through the filter.
+ should_clauses = [{"not": {"term": {"nipsa": True}}}]
+
+ if userid:
@nickstenning
nickstenning Jul 22, 2015 Member

... such as here. IMO this should be if userid is not None:

@seanh
seanh Jul 22, 2015 Contributor

I think what this comes down to is what we think it should do if the user code passes in an empty string for the userid: current behaviour is not to add any userid filter, the Es query will work as if there's no authorized user. If I changed it to is None it would add a {"term": {"user": ""}} filter which I'm guessing will cause a traceback from Es. Or we could explicitly check for the empty string and raise our own exception here. I think the current implementation is as defensible as any of the other options, isn't it?

@nickstenning
nickstenning Jul 22, 2015 Member

I don't really understand. If you pass user=123, or user=sys.stdout, this code is still going to do The Wrong Thing. As far as I'm concerned, this is again a sentinel-vs-falsy distinction: the API of this function says "the userid parameter must be a valid userid, or the sentinel None, to indicate that there is no authenticated user". So we we should be precise and check for that sentinel.

@seanh
seanh Jul 22, 2015 Contributor

So you're saying we assume the value is either None or a valid userid (so not the empty string and not falsy). In that case if userid: and if userid is not None: are the same behaviour-wise and it comes down to style. Our style guide says to use the implicit false if at all possible and here it is possible to get the right behaviour by using it ... So unless we want to introduce a new style rule along the lines of always use is None instead of the implicit false, or always use it in case of None as a sentinel value, or something.

@tilgovi
tilgovi Jul 22, 2015 Contributor

I think it should be explicit here and I don't think that violates the
style guide, either.

For me, it's just not the place for this function to judge whether an empty
string is a valid username. I know that sounds silly, but we're really
trying to check "was I passed an argument", no? Check that.

On Wed, Jul 22, 2015, 07:58 Sean Hammond notifications@github.com wrote:

In h/api/nipsa/search.py
#2300 (comment):

  •                "query": {...}
    
  •            }
    
  •        }
    
  •    }
    
  • :param userid: The ID of a user whose annotations should not be filtered.
  •    The returned filtered query won't filter out this user's annotations,
    
  •    even if the annotations have the NIPSA flag.
    
  • :type userid: unicode
  • """
  • If any one of these "should" clauses is true then the annotation will

  • get through the filter.

  • should_clauses = [{"not": {"term": {"nipsa": True}}}]
  • if userid:

So you're saying we assume the value is either None or a valid userid (so
not the empty string and not falsy). In that case if userid: and if
userid is not None: are the same behaviour-wise and it comes down to
style. Our style guide says to use the implicit false if at all possible
and here it is possible to get the right behaviour by using it ... So
unless we want to introduce a new style rule along the lines of always use is
None instead of the implicit false, or always use it in case of None as a
sentinel value, or something.

โ€”
Reply to this email directly or view it on GitHub
https://github.com/hypothesis/h/pull/2300/files#r35222179.

@nickstenning nickstenning and 1 other commented on an outdated diff Jul 22, 2015
h/api/test/views_test.py
+
+ assert annotation.get("nipsa") is True
+
+
+@patch("h.api.views.Annotation.save")
+@patch("h.api.views.nipsa.has_nipsa")
+@patch("h.api.views.get_user")
+def test_create_does_not_add_nipsa_flag(get_user, has_nipsa, _):
+ get_user.return_value = Mock(
+ id="test_id", consumer=Mock(key="test_key"))
+ has_nipsa.return_value = False
+ request = DummyRequest(json_body=_new_annotation)
+
+ annotation = views.create(request)
+
+ assert not annotation.get("nipsa")
@nickstenning
nickstenning Jul 22, 2015 Member

Another example of where this should be an explicit is None check, in my view. This test will pass if the nipsa field is set to the empty string, but that probably won't play well with the elasticsearch filters elsewhere in the PR.

@seanh
seanh Jul 22, 2015 Contributor

Hmm, I would have thought Es would play fine with that (the Es filter is a positive one: filter out anything that does have nipsa: true). But I guess what I really want here (if we want to test specifically what the actual behaviour is) is "nipsa" not in annotation

@nickstenning nickstenning commented on the diff Jul 22, 2015
h/conftest.py
@@ -91,6 +92,8 @@ def db_session(request, settings):
engine = db.make_engine(settings)
db.bind_engine(engine, should_create=True, should_drop=True)
+ api_db.use_session(db.Session)
@nickstenning
nickstenning Jul 22, 2015 Member

This isn't necessary. You're already calling db.make_engine three lines earlier, and that calls this, no?

@seanh
seanh Jul 22, 2015 Contributor

No, make_engine() doesn't call use_session(), the includeme() does. I could refactor h/db.py and h/api/db.py to make this line unnecessary here though.

@nickstenning
nickstenning Jul 22, 2015 Member

Oh, shit, sorry. My misreading. No, it's fine.

@nickstenning nickstenning commented on an outdated diff Jul 22, 2015
h/api/nipsa/models.py
@@ -0,0 +1,30 @@
+import sqlalchemy
+from sqlalchemy.orm import exc
+
+from h.api import db
+
+
+class NipsaUser(db.Base):
+
+ """A NIPSA entry for a user (SQLAlchemy ORM class)."""
+
+ __tablename__ = 'nipsa'
+ userid = sqlalchemy.Column(sqlalchemy.UnicodeText, primary_key=True,
@nickstenning
nickstenning Jul 22, 2015 Member

You probably want Unicode, not UnicodeText here.

@nickstenning nickstenning commented on an outdated diff Jul 22, 2015
h/api/nipsa/models.py
@@ -0,0 +1,30 @@
+import sqlalchemy
+from sqlalchemy.orm import exc
+
+from h.api import db
+
+
+class NipsaUser(db.Base):
+
+ """A NIPSA entry for a user (SQLAlchemy ORM class)."""
+
+ __tablename__ = 'nipsa'
+ userid = sqlalchemy.Column(sqlalchemy.UnicodeText, primary_key=True,
+ index=True)
@nickstenning
nickstenning Jul 22, 2015 Member

AIUI there's nothing gained by adding an index on a column that already has a PRIMARY KEY constraint.

@nickstenning nickstenning commented on an outdated diff Jul 22, 2015
@@ -61,13 +61,13 @@ def init_db(args):
def admin(args):
"""Make a user an admin."""
paster.bootstrap(args.config_uri, request=Request.blank(''))
- accounts.make_admin(unicode(args.user_id, sys.getfilesystemencoding()))
+ accounts.make_admin(unicode(args.userid, sys.getfilesystemencoding()))
@nickstenning
nickstenning Jul 22, 2015 Member

This isn't a userid.

@seanh
Contributor
seanh commented Jul 22, 2015

In my defense, all of the requested userid -> username changes are on parts of the code that are already using user_id instead of username on master, they only appear in this diff because I changed user_id to userid everywhere. Admittedly user_id is on master because of an earlier pull request by me, though. Anyway I'll make the changes.

Sean Hammond added some commits Jul 22, 2015
Sean Hammond Remove an unnecessary database index 048fd21
Sean Hammond Use Unicode instead of UnicodeText
And just pick an arbitrary long length
3634ccf
Sean Hammond Tweak a test assert
ff51906
@nickstenning
Member

Okay, I've checked this out and tested everything I can think of testing. It all seems to work!

Sean Hammond Change `if userid:` -> `if userid is not None:`
9b84fcd
@nickstenning
Member

It's been a long road, but I think we're there.

Thank you for all your hard work on this Sean.

๐Ÿฐ

@nickstenning nickstenning merged commit 2b1bcd1 into master Jul 22, 2015

2 of 3 checks passed

coverage/coveralls Coverage pending from Coveralls.io
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details
@nickstenning nickstenning deleted the nipsa branch Jul 22, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment