This repository has been archived by the owner on Jan 24, 2019. It is now read-only.

API for Discover hashtags #727

Closed
xmatthewx opened this issue Oct 21, 2015 · 30 comments

Comments

@xmatthewx

cc @vazquez

@xmatthewx xmatthewx self-assigned this Oct 21, 2015
@xmatthewx xmatthewx added this to the 1.5.0 milestone Oct 21, 2015
@vazquez

vazquez commented Oct 21, 2015

🍕

Client includes a single text field called description limited to 100 characters. A description can include #hashtags.

API should be able to:

  • Store description as metadata on project
  • Extract a list of #hashtags
  • Index #hashtags and store a list of related projects
  • Query for a list of projects that contain a specific #hashtag
  • Cache the project list for recent queries
  • Query for a list of featured hashtags
    • flagged manually by an admin
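
A minimal sketch of the extract, index, and query steps listed above, kept in memory for illustration. The function names, the `Map`-based index, the lowercasing of tags, and the ASCII-only tag pattern are all assumptions, not part of the spec; only the 100-character limit comes from the text above.

```javascript
// Hypothetical helpers; only the 100-character limit is taken from the spec.
const MAX_DESCRIPTION_LENGTH = 100;

// Return the unique hashtags in a description, without the leading "#".
function extractHashtags(description) {
  if (description.length > MAX_DESCRIPTION_LENGTH) {
    throw new RangeError("description exceeds " + MAX_DESCRIPTION_LENGTH + " characters");
  }
  const tags = new Set();
  for (const match of description.matchAll(/#([A-Za-z\d]+)/g)) {
    tags.add(match[1].toLowerCase()); // assumption: tags are case-insensitive
  }
  return [...tags];
}

// Record a project under each of its tags (index: Map of tag -> Set of project ids).
function indexProject(index, projectId, description) {
  for (const tag of extractHashtags(description)) {
    if (!index.has(tag)) {
      index.set(tag, new Set());
    }
    index.get(tag).add(projectId);
  }
}

// List the ids of projects that carry a specific tag.
function projectsTagged(index, tag) {
  return [...(index.get(tag.toLowerCase()) || [])];
}
```

A real implementation would persist the index in the database rather than in process memory.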

@xmatthewx
Author

Thanks @vazquez. I made a few edits above, but your draft 🍕 tasted great. I will discuss this with @cadecairos and @simonwex next week.

@xmatthewx xmatthewx modified the milestone: 1.5.0 Oct 21, 2015
@simonwex

This design of tags from MakeAPI might be a useful input

This Just In: Avoid Unnecessary Schema Changes with Tags!

Feel free to remove if not relevant.
Paula Le Dieu: Don't want to cause confusion but here is some work we have been doing in the MakeSmiths team about the kinds of tags/machine attributes we think would be useful for Makes.
https://webmakers.etherpad.mozilla.org/ContentPlanTags < work in progress


Users:
Author:
The creator of the original Make or the Remixer. -- The person with the publish button.

Admin:
An employee (and perhaps in future a contributor) of

Prefix Types:

Hash Tags & Raw Tags:
    Any tag prefixed by a hash can be added by the author

    For example: #mozfest, kittens, #teachtheweb, #sogladitsfriday, css

Application Tags:
    The application can use reserved *.webmaker.org prefixes for tags. Admin only.

    Ex: 
    webmaker.org:featured  <- This would be featured on webmaker.org home
    thimble.webmaker.org:featured <- this would be featured on the thimble home page
    webmaker.org:project <- this is a webmaker project
    webmaker.org:challenge <- this is a webmaker challenge
    webmaker.org:kit <- this is a webmaker kit

User Tags

    These tags can be added by any authenticated user. The tags are prefixed by their user_id. -- In the case of Persona, this is their email address.

    ex:
    simon@simonwex.com:awesome
    simon@simonwex.com:favourites <-- this could be used as "playlist" functionality

Mention Tags? -- Tagging other users, question mark.


Question:

<meta name="webmaker-tag:kit">
thimble.webmaker.org:kit

<!-- this requires admin -->

<meta name="webmaker-tag" content="webmaker.org:kit,thimble.webmaker.org:featured,foo">

Would result in:

webmaker.org:kit <-- application tag
thimble.webmaker.org:featured <-- application tag
foo <-- raw tag

There will be no translation

tags.indexOf("webmaker.org:kit")
tags.has("...")
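
As an illustration of the prefix types described above, here is a rough sketch that splits a comma-separated `content` value and classifies each tag. The function names and the returned object shape are hypothetical, and the sketch does not enforce the admin-only rule for application tags.

```javascript
// Hypothetical classifier for the prefix types described above.
function classifyTag(tag) {
  const colon = tag.indexOf(":");
  if (colon === -1) {
    // No prefix: a hash tag if it starts with "#", otherwise a raw tag.
    return { type: tag.startsWith("#") ? "hash" : "raw", prefix: null, name: tag };
  }
  const prefix = tag.slice(0, colon);
  const name = tag.slice(colon + 1);
  if (prefix === "webmaker.org" || prefix.endsWith(".webmaker.org")) {
    return { type: "application", prefix, name };
  }
  if (prefix.includes("@")) {
    // Assumption: user ids are email addresses, as in the Persona case.
    return { type: "user", prefix, name };
  }
  // Unknown prefix: treat the whole string as a raw tag.
  return { type: "raw", prefix: null, name: tag };
}

// Split a meta tag's content attribute into classified tags.
function parseMetaContent(content) {
  return content
    .split(",")
    .map((tag) => tag.trim())
    .filter((tag) => tag.length > 0)
    .map(classifyTag);
}
```

For example, `parseMetaContent("webmaker.org:kit,thimble.webmaker.org:featured,foo")` yields two application tags followed by one raw tag, matching the result listed above.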

@xmatthewx
Author

@gvn - can you take a look at the short list above: #727 (comment) to see if this reflects a minimal spec for hashtags? MakeAPI might be a useful reference, but includes more complexity than we require.

@cadecairos
Contributor

I had an idea how to implement this and ran with it:

https://github.com/mozilla/api.webmaker.org/compare/develop...cadecairos:hashtags?expand=1

POC code here. Adds description (varchar) and metadata (jsonb) fields to projects. Hashtags in description are captured and saved to metadata.tags. metadata.tags has an index that lets us do fancy shmancy queries, pretty quickly.

This patch adds in migration code, and support for description/metadata to the create project route, and adds a GET /projects/tagged/{tag} route.

IMO, getting this to a shippable state shouldn't take more than several days if this approach is acceptable. Just need to add support for description/metadata fields as required and write some tests.

(edit: for server-side support. not front-end :))
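
For readers unfamiliar with jsonb lookups, the following is a rough sketch of how such a query could be built; it is not the code in the linked branch. The `projects` table name, the selected columns, and the index name are assumptions. Only the `metadata` column and its `tags` key come from the description above. The `@>` containment operator and GIN indexes on jsonb are standard PostgreSQL features.

```javascript
// Assumed migration: a GIN index so containment (@>) lookups on metadata can use it.
const CREATE_TAG_INDEX_SQL =
  "CREATE INDEX projects_metadata_idx ON projects USING GIN (metadata jsonb_path_ops)";

// Build a parameterized query for projects whose metadata.tags contains `tag`.
// The { text, values } shape is the query config form used by node-postgres style clients.
function buildTaggedQuery(tag) {
  return {
    text: "SELECT id, title, description FROM projects WHERE metadata @> $1::jsonb",
    values: [JSON.stringify({ tags: [tag] })],
  };
}
```

Passing the tag as a bound parameter, rather than concatenating it into the SQL string, keeps user-supplied tags from being interpreted as SQL.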

@gvn
Contributor

gvn commented Oct 30, 2015

That spec looks pretty good to me. 👍

@cadecairos A couple things:

  1. It looks like your regex can be simplified to just #([A-Za-z\d]+).
  2. Does your code handle removing tags if a project description changes?

General question to all: Are we going to try and allow non alphanumeric hash tags? (EG: Chinese characters? #指事字)
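
To make the trade-off concrete, this sketch compares the simplified ASCII pattern with a Unicode-aware alternative. The Unicode pattern and the helper name are assumptions for illustration and were not proposed in this thread.

```javascript
// The simplified pattern from point 1, with the global flag for repeated matching.
const ASCII_TAG = /#([A-Za-z\d]+)/g;

// Assumed alternative: any Unicode letter or number (requires the "u" flag).
const UNICODE_TAG = /#([\p{L}\p{N}]+)/gu;

// Hypothetical helper: return tag names (without "#") found by the given pattern.
function findTags(text, pattern) {
  return Array.from(text.matchAll(pattern), (match) => match[1]);
}

console.log(findTags("#mozfest #指事字", ASCII_TAG));   // ASCII-only tags
console.log(findTags("#mozfest #指事字", UNICODE_TAG)); // also matches the Chinese tag
```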

@xmatthewx
Author

Thanks @cadecairos for diving into this.

Good thinking @gvn on removing tags that don't appear in an updated description.

Re: non-alphanumeric characters - let's worry about that later, if we discover a demand.

@cadecairos
Contributor

@gvn yeah, updating tags when description changes is easy, I just haven't done that part yet :)
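
One possible shape for that update step, sketched under the assumption that tags are stored as plain arrays: re-extract the tags from the new description and compare them with the stored list. The function name is hypothetical. If tags live only in metadata.tags, overwriting the array with the new list would be enough; the diff matters only when a separate index must also be updated.

```javascript
// Hypothetical helper: which tags were added or removed by a description change.
function diffTags(oldTags, newTags) {
  const oldSet = new Set(oldTags);
  const newSet = new Set(newTags);
  return {
    added: newTags.filter((tag) => !oldSet.has(tag)),
    removed: oldTags.filter((tag) => !newSet.has(tag)),
  };
}
```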

@xmatthewx
Author

I added one more bit above #727 (comment). We should be able to query for hashtags flagged as featured (an admin can flag these manually).

@xmatthewx
Author

Cool. We agreed that tag extraction will be handled server side. "Featured" tags will initially be set up as an environment variable, delivered along with the current default Discover payload. To reduce HTTP hits and speed things up for users, the payload might also include project metadata, as long as that doesn't add too much overhead.
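
A minimal sketch of the environment-variable approach, assuming a comma-separated list. The variable name `FEATURED_TAGS`, the function names, and the payload shape are hypothetical and were not agreed in this thread.

```javascript
// Parse a comma-separated list such as "mozfest, #teachtheweb" into tag names.
function parseFeaturedTags(raw) {
  return (raw || "")
    .split(",")
    .map((tag) => tag.trim().replace(/^#/, "")) // tolerate an optional leading "#"
    .filter((tag) => tag.length > 0);
}

// Assumption: the variable is called FEATURED_TAGS; unset means no featured tags.
const featuredTags = parseFeaturedTags(process.env.FEATURED_TAGS);

// Hypothetical payload shape: the default Discover projects plus the featured tags.
function buildDiscoverPayload(projects, featured) {
  return { projects, featuredTags: featured };
}
```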

@cadecairos will draft some example requests and responses for discussion.

Anything else?

@cadecairos
Contributor

@gvn
Contributor

gvn commented Nov 3, 2015

I just left some comments in the gist... 👯

@xmatthewx
Author

Cool. Dropped in some comments as well.

@cadecairos
Contributor

I put a bit more work into this while on the plane to Mozfest.

mozilla/api.webmaker.org@develop...cadecairos:hashtags

The create/update project routes now accept description as a payload param, which is parsed into an array of tags that is saved into a jsonb column called "metadata" as { tags: [] }. This column is indexed to improve searching.

There are two ways to search for tags: via GET /projects/tagged/{tag} and via GET /projects/tagged?tags=tag1,tag2,tag3. The first searches for a single tag as a route param; the second searches for up to ten tags (an arbitrary limit), where a make needs to have only one of the tags to count as a hit.

I'm not sure if that second route should search for makes matching just one or all tags in the query, but it's a simple operator change. We could also include a flag that chooses the type of match operator dynamically.
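
The any-versus-all choice can be illustrated in memory as below. This is a sketch and not the query in the branch: the function names, the `matchAll` flag, and the project shape (`metadata.tags` as an array) are assumptions, while the limit of ten tags comes from the description above.

```javascript
const MAX_QUERY_TAGS = 10; // the arbitrary limit mentioned above

// Parse "tag1,tag2,tag3" from the query string, keeping at most ten tags.
function parseTagsParam(param) {
  return param
    .split(",")
    .map((tag) => tag.trim())
    .filter((tag) => tag.length > 0)
    .slice(0, MAX_QUERY_TAGS);
}

// Keep projects matching any tag (default) or all tags (matchAll = true).
function filterByTags(projects, tags, matchAll = false) {
  if (tags.length === 0) {
    return []; // avoid "every" matching all projects on an empty tag list
  }
  return projects.filter((project) => {
    const own = new Set(project.metadata.tags);
    return matchAll
      ? tags.every((tag) => own.has(tag))
      : tags.some((tag) => own.has(tag));
  });
}
```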

@gvn
Contributor

gvn commented Nov 5, 2015

I think the second route should probably be projects matching any of the tags by default, but it might be nice to have a flag for matching all tags. However, off the top of my head I can't think of a good use case for matching all... ¯\_(ツ)_/¯

@xmatthewx
Author

I don't think we'll have a need for advanced search for multiple tags with and/or. Let's just focus on the first route for now.

@xmatthewx xmatthewx changed the title from "Write requirements for Discover hashtags" to "API for Discover hashtags" Nov 9, 2015
@xmatthewx xmatthewx modified the milestones: 1.6.0, 1.5.0 Nov 9, 2015
@xmatthewx xmatthewx added p1 and removed p1 labels Nov 12, 2015
@xmatthewx xmatthewx mentioned this issue Nov 16, 2015
8 tasks
@xmatthewx xmatthewx assigned cadecairos and unassigned xmatthewx Nov 16, 2015
@xmatthewx
Author

@cadecairos - let's discuss the status of this on our Tuesday stand. If that time doesn't work for you, grab another time slot for you, me, @gvn, @alanmoo.

@cadecairos
Contributor

@xmatthewx can I get an invite to that stand?

@xmatthewx
Author

11:30 eastern tomorrow. Not on your calendar? Oye.

@xmatthewx
Author

Moving along:

  • @gvn has WIP project description in settings
  • @cadecairos will post quick documentation on tag routes before wrapping up tests and pushing this to staging

@xmatthewx
Author

Note: we've reduced the limit on description from 140 to 100 characters.

@cadecairos
Contributor

@xmatthewx
Author

thanks @cadecairos for the speedy work!

@xmatthewx
Author

hey @cadecairos - all of us south of the border have PTO the second half of this week. Ping us with any questions about tags before we vanish. It'd be great to run with this full speed when we return.

@cadecairos
Contributor

Should I filter the tag suggestion results for bad words? It might be heavy-handed to blacklist words from description and tags completely, but suggesting "shit" when someone types "shi" (ship?) sounds undesirable.
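
A rough sketch of the narrower option, filtering only the suggestions and leaving descriptions and stored tags alone. The function name, the prefix-matching rule, and the blocklist contents are assumptions.

```javascript
// Hypothetical helper: suggest known tags that start with the typed prefix,
// skipping anything on the blocklist. Stored tags themselves are untouched.
function suggestTags(prefix, knownTags, blocklist) {
  const needle = prefix.toLowerCase();
  return knownTags.filter((tag) => {
    const lower = tag.toLowerCase();
    return lower.startsWith(needle) && !blocklist.has(lower);
  });
}

// Example: the blocked word is never suggested, while "ship" and "shiny" are.
const suggestions = suggestTags("shi", ["ship", "shiny", "shit", "css"], new Set(["shit"]));
console.log(suggestions);
```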

@xmatthewx
Author

Tag suggestions are not part of MVP. Let's get the basics up and running first, and see how people use them before we build out other bits.

@cadecairos
Contributor

I already built that while waiting for code review, but I'll separate it out into a different patch, in case we need it later.

@xmatthewx
Author

Thanks @cadecairos - drop an update here when you're ready for @gvn to hook this up.

@xmatthewx
Author

@gvn - description api has landed on staging. check in with @cadecairos about pushing tags there too. tests are almost complete. Might be good to have that in staging for you while work is completed.

@xmatthewx
Author

Done and done.
