URL Schema #51

Closed
trwnh opened this issue Jun 1, 2018 · 10 comments
@trwnh
Member

trwnh commented Jun 1, 2018

Right now, image and comment permalinks share a path of site.tld/p/username/number, with no differentiation. I assume "p" stands for "post", but the following concerns arise:

  1. You cannot tell whether a given link leads to an image or to a comment.
  2. You cannot interact with a given image unless you click again to load its permalink.

Ideally, a comment's permalink would be somehow nested under the image that it refers to, and the image permalink would be nested under the username.

Also:

  1. Numbers are sequential across the entire instance -- this means that with minimal work, any given URL can be guessed.
  2. Usernames are using the same namespace as any other subdirectory (thanks @EliotBerriot)

A sample permalink that doesn't have as many issues would be: http://pixelfed.social/@trwnh/post/612987361475/comment/1

@agateblue

Also, I'm wondering what would happen if a user registered as "site", since the user namespace is at the root without an @ or any other prefix. Would this prevent the user's profile from being accessed, or would it override the /site URL?

@dansup
Member

dansup commented Jun 2, 2018

@trwnh

You cannot tell whether a given link leads to an image or to a comment.

Every post and reply are stored in the same table; the biggest difference between the two is that one has a media relationship with a photo and the other is a reply to another post. I wouldn't mind having comments/replies under /c/ instead of /p/.

You cannot interact with a given image unless you click again to load its permalink.

Yes, that is by design. It's meant to be able to link to a specific comment/reply.

Numbers are sequential across the entire instance -- this means that with minimal work, any given URL can be guessed.

First off, each post or comment requires a valid username and id. You cannot just increment the id, you need to have the valid username for that post. Secondly, eventually the APIs will be ready so anyone will be able to see every (public) post anyways.

Usernames are using the same namespace as any other subdirectory

Yes, I am using the same routing path that Instagram uses.

Possible solutions:
http://example.org/p/trwnh/612987361475/comment/1612987361475
http://example.org/c/trwnh/1612987361475
http://example.org/p/trwnh/612987361475/c/1612987361475

@trwnh
Member Author

trwnh commented Jun 2, 2018

Every post and reply are stored in the same table

Shouldn't they be stored in different ones, then? There's a functional difference between a comment and an image-post -- #84 is related to this, I think.

It's meant to be able to link to a specific comment/reply

That's fine, but linking to a specific comment should highlight that comment on the parent image. Right now, comment permalinks make it look like an image didn't get more than 2 comments.

each post or comment requires a valid username and id

Yes, but right now it's too easy to find comments by iterating over a much smaller address space. As I explained a few minutes ago in IRC: given pixelfed.social/p/trwnh/276, anyone can easily modify the 276 part to find not just the posts, but also the comments.

eventually the APIs will be ready so anyone will be able to see every (public) post anyways

Will the API include fetching every single comment that a person made?

I am using the same routing path that Instagram uses

Ideally this would be rethought, because Instagram's URLs are pretty ugly and not very useful for denoting any sort of structure. Right now, a sample Instagram URL looks something like

https://www.instagram.com/p/BQ4fS0wDK3P/?taken-by=birdsounds

and it is not really clear how that is structured. In fact, the taken-by parameter seems to be entirely optional, since
https://www.instagram.com/p/BQ4fS0wDK3P/
will load the same post.

It makes more sense IMO to use a URL structure that reflects the hierarchy: users make posts which receive comments, which is why I proposed
pixelfed.social/@user/postid/comment/commentid
For example,
pixelfed.social/@trwnh/BQ4fS0wDK3P/c/1
has the following characteristics:

  • Its top-level is the user, demarcated by a symbol to prevent certain usernames from conflicting with other paths (like naming a user user or site or timeline)
  • the post ID isn't as easy to guess as 276 -- either include characters to expand your base from 10 to 36 or 62 (for shorter URLs), or include more numbers (perhaps more familiar and easy to type)
  • comments can be sequential because they are immediately visible and don't require any effort to search out

Like the second bullet above explains, it could be

pixelfed.social/@trwnh/BQ4fS0wDK3P/c/1 or
pixelfed.social/@trwnh/100132771334471774/c/1

depending on which is more valuable to you.
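
As a rough illustration of the second bullet above, the sketch below re-encodes a numeric post ID in base 36 or base 62. The alphabet ordering and the function names `encode_id` and `decode_id` are assumptions made for this example, not anything Pixelfed is known to use.

```python
import string

# Assumed alphabet: digits, then uppercase, then lowercase (62 symbols).
# Any fixed ordering works as long as encode and decode agree on it.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def encode_id(n: int, base: int = 62) -> str:
    """Write a non-negative integer using the first `base` symbols of ALPHABET."""
    if n < 0:
        raise ValueError("id must be non-negative")
    if not 2 <= base <= len(ALPHABET):
        raise ValueError("base must be between 2 and 62")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, base)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def decode_id(s: str, base: int = 62) -> int:
    """Inverse of encode_id; raises ValueError on symbols outside the base."""
    n = 0
    for ch in s:
        value = ALPHABET.index(ch)  # ValueError if ch is not in ALPHABET
        if value >= base:
            raise ValueError(f"symbol {ch!r} is not valid in base {base}")
        n = n * base + value
    return n


if __name__ == "__main__":
    post_id = 100132771334471774
    short = encode_id(post_id)
    print(short, decode_id(short) == post_id)
```

Note that re-encoding only shortens the URL: a sequential counter written in base 62 is still sequential. IDs become harder to guess only when the underlying number comes from a much larger or non-sequential space, as with the 18-digit example above.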

@agateblue

@dansup regarding the username space, I think Instagram relies on a check during signup to ensure a username does not conflict with another URL. Unless you have a similar check on your side, you may have issues.

The idea of using the @ to denote a user profile in the URL is probably easy enough to set up, clear enough for users, and good enough to solve the namespacing issue without putting additional logic in the signup flow.
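
To make the trade-off concrete, here is a minimal sketch of the two approaches: root-level usernames guarded by a reserved-word check at signup, versus usernames namespaced behind an @ prefix. The `RESERVED_PATHS` set and all function names are hypothetical and only stand in for whatever routes an instance actually defines.

```python
# Hypothetical set of top-level routes; a real instance would have its own list.
RESERVED_PATHS = {"site", "timeline", "user", "p", "c"}


def first_segment(path: str) -> str:
    """Return the first path segment, e.g. '/site/about' -> 'site'."""
    return path.strip("/").split("/")[0]


def resolve_root_level(path: str):
    """Usernames share the root namespace, so a reserved route always wins."""
    segment = first_segment(path)
    if segment in RESERVED_PATHS:
        return ("page", segment)
    return ("profile", segment)


def can_register(username: str) -> bool:
    """Signup check needed when usernames live at the root."""
    return username.lower() not in RESERVED_PATHS


def resolve_prefixed(path: str):
    """Usernames live under '@', so they can never collide with a route."""
    segment = first_segment(path)
    if segment.startswith("@") and len(segment) > 1:
        return ("profile", segment[1:])
    return ("page", segment)


if __name__ == "__main__":
    print(resolve_root_level("/site"))  # the route shadows a user named "site"
    print(resolve_prefixed("/@site"))   # the profile is reachable
    print(resolve_prefixed("/site"))    # the route is still reachable
```

With the prefix, a user named "site" and the /site page can coexist, and the reserved list no longer has to grow every time a new top-level route is added.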

@dansup dansup changed the title from "Permalink structure is confusing / has potential security issues" to "URL Schema" Jun 4, 2018
@dansup
Member

dansup commented Jun 4, 2018

pixelfed.social/p/trwnh/100132771334471774/c/100132771334471775 it is!

@dansup
Member

dansup commented Jun 4, 2018

This won't break old links,

pixelfed.social/p/trwnh/100132771334471774 would redirect to pixelfed.social/p/trwnh/100132771334471774/c/100132771334471775 if it was a comment!
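
The redirect behaviour described here could look roughly like the sketch below. It assumes a single `Status` record type for both posts and comments (as mentioned earlier in the thread), one level of nesting, and a 302 status code; the class, the field names, and the commenter's username are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Status:
    """Hypothetical stand-in for a row in the shared posts/replies table."""
    id: int
    username: str                           # author of this status
    in_reply_to: Optional["Status"] = None  # parent post, if this is a comment


def canonical_path(status: Status) -> str:
    """Posts live at /p/{user}/{id}; comments are nested under their parent post."""
    parent = status.in_reply_to
    if parent is None:
        return f"/p/{status.username}/{status.id}"
    # The username segment belongs to the parent post's author.
    return f"/p/{parent.username}/{parent.id}/c/{status.id}"


def handle_request(requested_path: str, status: Status) -> Tuple[int, str]:
    """Serve the canonical path directly; redirect any other path that found this status."""
    target = canonical_path(status)
    if requested_path != target:
        return (302, target)  # assumed status code for the redirect
    return (200, target)


if __name__ == "__main__":
    post = Status(100132771334471774, "trwnh")
    comment = Status(100132771334471775, "someone", in_reply_to=post)
    print(handle_request("/p/someone/100132771334471775", comment))
    print(handle_request("/p/trwnh/100132771334471774", post))
```

Because the old-style link still resolves to a status, it keeps working and simply lands on the nested comment URL.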

@trwnh
Member Author

trwnh commented Jun 4, 2018

[01:09] <trwnh>   | dansup: one last question i think, what was the reasoning for
                  | using /p/ instead of @ before usernames in urls? or is there a
                  | possibility of using "pretty url" to mask this in the future?
[01:09] <@dansup> | trwnh: im using the same convention instagram uses
[01:10] <@dansup> | just like how mastodon uses twitters and not gnu socials
[01:10] <@dansup> | pleroma uses /notice/id
[01:10] <@dansup> | like GS
[01:10] <trwnh>   | yeah just wondering about the technical merits of each
[01:10] <trwnh>   | and if there was a reason to prefer one over the others
[01:10] <@dansup> | notice/id is more performant
[01:11] <trwnh>   | ah
[01:11] <@dansup> | if i did /p/{id}?taken-by={username} that would be more IG like
[01:11] <trwnh>   | yeah but that would be a nightmare for visibility
[01:11] <trwnh>   | technically nothing wrong with it because dropping the taken-by
                  | still loads the same post
[01:12] <@dansup> | the taken-by loads the modal
[01:12] <@dansup> | oh nvm
[01:12] <trwnh>   | not entirely, it actually loads that url in a modal when clicking from
                  | the profile page, but cold-loading it in a new tab opens the post page
[01:12] <@dansup> | yeah
[01:12] <trwnh>   | it ends up being ignored entirely
[01:12] <trwnh>   | which is pretty weird
[01:13] <trwnh>   | functionally i think that URL should show the uploader so that any
                  | casual observer knows they took it
[01:13] <trwnh>   | i have no strong feelings about @ vs /p/ but i slightly prefer @
[01:14] <trwnh>   | i think it might actually be better to use /p/ internally because
                  | it isn't a symbol
[01:14] <trwnh>   | in a similar way to how mastodon actually uses /users/55816 internally
[01:14] <trwnh>   | but the pretty-url shows it as /@user
[01:15] <trwnh>   | that would probably be the better compromise because it would be
                  | a more ideal solution dansup
[01:15] <@dansup> | its not used internally, just for routing. only need to change
                  | Status model and routes file
[01:16] <trwnh>   | correct me if i'm wrong but it would still be possible to set up
                  | pretty-url in the future, right? they would both resolve internally
                  | to the same thing
[01:16] <@dansup> | yes
[01:16] <@dansup> | would only take a few lines of code to support a legacy route
[01:17] <trwnh>   | end goal imo is to be able to share with my friends 
                  | pixelfed.social/@trwnh/1276/comment/7639 or whatever
[01:17] <@dansup> | trwnh: you can re-open #51 and mention what you just said
[01:17] <@dansup> | so that I remember, please
[01:17] <@dansup> | I'm working on adding a polymorphic relation to the
                  | notifications table to support better notifications
[01:17] <trwnh>   | so that navigating the structure can easily be added or removed
                  | by one term
[01:17] <trwnh>   | ah ok i'll do that!

@dansup dansup self-assigned this Jun 4, 2018
@trwnh
Member Author

trwnh commented Jun 5, 2018

Minor error in new schema: seems username gets replaced incorrectly.

Example post: https://pixelfed.social/p/trwnh/2027
Example comment: https://pixelfed.social/p/lilletale/2027/c/2040
Expected comment URL: https://pixelfed.social/p/trwnh/2027/c/2040

@dansup
Member

dansup commented Nov 22, 2018

Fixed in latest commits!

@trwnh
Member Author

trwnh commented Jun 11, 2019

Reopening to keep track of implementing updated schema.

ActivityPub id and url

  • user id = /u/userid
  • user url = /@username or /username
  • post id = /p/postid
  • post url = /@username/postid or /username/postid
  • activity id = /p/postid/create? /u/userid/follows/followid? /u/userid/follows/followid/undo??

Needs:

  • assign user ids
  • map username to id
  • assign simpler post ids
  • assign consistent activity ids?
  • update router to 302 old ids to new ids
  • cache all this in redis
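
A minimal sketch of how the id/url split and the redirect from old ids might fit together, under several assumptions: the in-memory `USERNAME_TO_ID` dict stands in for the cached username-to-id map, the user id value is made up, and all function names are hypothetical. Only old post paths are handled; old comment paths are left out.

```python
from typing import Dict, Optional, Tuple

# Stand-in for the cached username -> stable user id map (made-up data).
USERNAME_TO_ID: Dict[str, str] = {"trwnh": "1"}


def user_id_path(username: str) -> Optional[str]:
    """Stable identifier path for a user: /u/{userid}, or None if unknown."""
    user_id = USERNAME_TO_ID.get(username)
    return None if user_id is None else f"/u/{user_id}"


def post_id_path(post_id: str) -> str:
    """Stable identifier path for a post: /p/{postid}."""
    return f"/p/{post_id}"


def post_url(username: str, post_id: str) -> str:
    """Human-friendly URL for the same post: /@{username}/{postid}."""
    return f"/@{username}/{post_id}"


def redirect_old_post_id(path: str) -> Optional[Tuple[int, str]]:
    """Map an old-style /p/{username}/{postid} path to the new /p/{postid} id."""
    parts = path.strip("/").split("/")
    if len(parts) == 3 and parts[0] == "p":
        return (302, post_id_path(parts[2]))
    return None  # already a new-style id, or not a post path


if __name__ == "__main__":
    print(user_id_path("trwnh"))
    print(post_url("trwnh", "2027"))
    print(redirect_old_post_id("/p/trwnh/2027"))
```

Keeping the id free of the username is what lets the friendly URL change later without invalidating anything that stored the id.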

IRC discussion

dansup | so i think you won that argument about semi-nomadic identities
 trwnh | obviously the simplest schema is /p/postid and /u/userid
 trwnh | the important thing is not to mix user/post ids
 trwnh | aside from those two things i don't have any hard reqs
 trwnh | the stub can be anything
dansup | there will be no collision since the prefix /p/ and /u/
dansup | posts use snowflake ids
 trwnh | the only thing i'm not 100% aware of is the database stuff
 trwnh | you know more about that than me
dansup | I'd like to use a vanity url of /u/username that either redirects to /u/id or vice versa
 trwnh | yeah
dansup | that way we could allow username changes
 trwnh | i'm personally biased toward /@username as a vanity url
dansup | since the AP url would be an id
 trwnh | but /u/ is fine too
dansup | actually that would be a lot better
 trwnh | AP url does not have to be an id at all
 trwnh | it just has to be dereferencable (so that means http(s) right now)
dansup | because we can redirect /username to /@username without hitting the db
 trwnh | at some point there needs to be a lookup to translate /username to /u/id, right?
 trwnh | i'm conceiving a router of some sort
 trwnh | or /@username to /u/id
dansup | in routes/web.php that would be Route::redirect('{username}', '@{username}'); iirc
dansup | at the very bottom
 trwnh | what does that imply though
dansup | otherwise it would conflict with every URL
 trwnh | the use of a prefix is effectively to namespace usernames and avoid conflicts
 trwnh | if you prefix all usernames with /@ or /u/ then you no longer have to have as many "reserved" usernames
 trwnh | regardless of whatever happens on the backend after the initial request
 trwnh | so to map it out, if we had urls in the form /@username/12345678 then someone trying to query that url would do a database lookup for post 12345678
 trwnh | if @username changed their username to @username2 then the old url could still resolve as long as the post id didn't change
dansup | yeah
 trwnh | functionally, /@username/123 == /@username2/123 == /p/123
 trwnh | it shouldn't matter which route the end user takes, the same post should be served up regardless
 trwnh | but for shareability purposes it looks nice to have the username be part of the url (even if it's not part of the id)
 trwnh | at least, speaking as someone who determines which links to click based partially on who posted them
 trwnh | like if i'm browsing masto and i see a url that starts with https://domain.tld/@baduser/... then i know i don't even need to bother clicking, even if domain.tld is usually ok
 trwnh | or even simply if it's something uninteresting, like "i don't care to click on any link from pixelfed.social/@trwnh
 trwnh | you get what i'm saying, right?
dansup | we need to do a canonical meta tag
dansup | yeah
 trwnh | lookup of a post should only require the postid, lookup of an actor should only require the actor id
 trwnh | and honestly taking it one step further, if you want to let people make one account and manage multiple profiles, then you should also abstract away users from actors
 trwnh | i.e. an account is a set of credentials (email/pass/2fa) and a profile is a /u/id
 trwnh | the good thing is, if you do that, you should be theoretically able to handle any future changes
 trwnh | doing those 3 things should result in eliminating most fragility that existing impls have
dansup | the problem with using /u/id is enumeration. we should migrate them to snowflake ids
 trwnh | dansup: it doesn't matter what the actual id is. if you're concerned about enumeration then change them to whatever. snowflake, 128bit, 256bit, hash of first public key, whatever
 trwnh | maybe that means doing some database migrations, sure
 trwnh | the only things you have to worry about are the things that are publicly exposed
 trwnh | e.g. if you never expose an API for integer ids then enumeration isn't really a concern, is it? 
 trwnh | but yeah snowflake/hash/whatever for all object ids. actor, activity, note, image
dansup | yeah
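
The equivalence discussed above (/@username/123 == /@username2/123 == /p/123) can be sketched as a lookup that ignores the username segment and relies on the post id alone. The in-memory tables, the function names, and the choice of /@username/postid as the canonical form are assumptions for illustration.

```python
from typing import Optional

# Made-up data: post 123 belongs to profile 42, which was renamed
# from "username" to "username2".
POSTS = {"123": {"author_id": "42"}}
PROFILES = {"42": {"username": "username2"}}


def find_post(path: str) -> Optional[str]:
    """Return the post id for /p/{id} or /@{anything}/{id}, if the post exists."""
    parts = [p for p in path.split("/") if p]
    if len(parts) != 2:
        return None
    prefix, post_id = parts
    # Only the post id is used for the lookup; the username segment is cosmetic.
    if prefix == "p" or (prefix.startswith("@") and len(prefix) > 1):
        return post_id if post_id in POSTS else None
    return None


def canonical_url(post_id: str) -> str:
    """Build the shareable URL from the author's current username."""
    author_id = POSTS[post_id]["author_id"]
    username = PROFILES[author_id]["username"]
    return f"/@{username}/{post_id}"


if __name__ == "__main__":
    for path in ("/@username/123", "/@username2/123", "/p/123"):
        post_id = find_post(path)
        print(path, "->", canonical_url(post_id) if post_id else None)
```

All three paths resolve to the same post, and the canonical URL always reflects the current username, so links shared before a rename keep working.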
