DOCS: Document cookie usage in Etherpad - Waiting on Final Edits #3563

Closed
2 tasks done
tiblu opened this issue Feb 25, 2019 · 22 comments · Fixed by #3921

Contributor

@tiblu tiblu commented Feb 25, 2019

Overview

It is important for both the users and the host of an Etherpad instance to know which cookies Etherpad uses and what each one is for.

List of known cookies issued by Etherpad

| name | sample value | domain | path | expires/max-age | HttpOnly | secure | usage |
|---|---|---|---|---|---|---|---|
| express_sid | s%3A7yCNjRmTW8ylGQ53I2IhOwYF9... | example.org | / | 1969-12-31T23:59:59.000Z | true | true | Session ID of the Express web framework. When Etherpad is behind a reverse proxy and an administrator wants to use session stickiness, they may use this cookie. If you are behind a reverse proxy, please remember to set trustProxy: true in settings.json |
| io | uUrtV8P-cJF0IVDOAAAV | example.org | / | 1969-12-31T23:59:59.000Z | true | false | No longer used since Etherpad 1.8, see a51684b |
| language | en | example.org | / | 1969-12-31T23:59:59.000Z | false | true | The language of the UI (e.g.: en-GB, it) |
| prefs | %7B%22epThemesExtTheme%22... | example.org | /p | 3000-02-25T13:17:08.000Z | false | true | Client-side preferences (e.g.: font family, chat always visible, show authorship colors, ...) |
| token | t.tFzkihhhBf4xKEpCK3PU | example.org | / | 2019-04-26T13:17:07.000Z | false | true | A random token representing the author, of the form t.randomstring_of_length_20. The random string is generated by the client (at pad.js#L55-L66). This cookie is always set by the client (at pad.js#L153-L158) without any solicitation from the server. It is used for all the pads accessed via the web UI (not used for the HTTP API). On the server side, its value is accessed at SecurityManager.js#L33. |
| sessionID | s.1c70968b333b25476a2c7bdd0e0bed17 | example.org | / | 2019-04-26T13:17:07.000Z | ? | ? | Sessions can be created between a group and an author. This allows an author to access more than one group. The sessionID will be set as a cookie to the client and is valid until a certain date. The session cookie can also contain multiple comma-separated sessionIDs, allowing a user to edit pads in different groups at the same time. More info: https://github.com/ether/etherpad-lite/blob/develop/doc/api/http_api.md#session |
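The prefs sample value in the list above starts with `%7B%22epThemesExtTheme%22`, which URL-decodes to `{"epThemesExtTheme"`, so the cookie looks like URL-encoded JSON. A minimal sketch of decoding such a value follows. The sample is truncated in the list, so the complete payload used here (`exampleSetting`) and the helper name `decodePrefsCookie` are hypothetical and not taken from Etherpad.

```javascript
// Hypothetical, complete cookie value. The real sample above is truncated,
// so this payload is invented purely for illustration.
const samplePrefs = encodeURIComponent(JSON.stringify({ exampleSetting: true }));

// Decode a URL-encoded JSON cookie value; return null if it cannot be decoded.
function decodePrefsCookie(rawValue) {
  try {
    return JSON.parse(decodeURIComponent(rawValue));
  } catch (err) {
    // Malformed percent-encoding or incomplete JSON both end up here.
    return null;
  }
}

console.log(decodePrefsCookie(samplePrefs));
// The truncated prefix from the list is not valid JSON on its own.
console.log(decodePrefsCookie('%7B%22epThemesExtTheme%22'));
```

Returning null for undecodable input keeps the caller from crashing on a cookie that was edited by hand or cut short.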

TODO

  • Extend the cookie list above
  • Add usage clarifications to every cookie
Contributor

@muxator muxator commented Feb 25, 2019

That's right.
A PR with the needed changes in the documentation would be promptly accepted.

BTW, this table is going to be updated as soon as #3561 is implemented: we will need to explain that the secure flag depends on whether or not Etherpad is accessed via TLS.

Contributor Author

@tiblu tiblu commented Feb 25, 2019

@muxator Thanks for the quick response.

I can PR the documentation change once the information has been collected, but first I need to be sure the information is correct.

Can you comment on any of the cookies mentioned above?

@muxator muxator added this to the 1.8 milestone Mar 27, 2019

@HamzaKhait HamzaKhait commented May 22, 2019

Hello,
Is there any way to completely disable these cookies? I don't think they're useful.

Contributor

@muxator muxator commented Dec 7, 2019

@tiblu, I finally had the time to work on #3561 and thus simplified & clarified the scope of the cookies.

The prefs cookie holds the client-side settings (e.g.: font family, chat always visible, show authorship colors, ...), but I did not have time to assess the code as thoroughly as I did for the other ones.

Contributor

@muxator muxator commented Dec 7, 2019

Added another optional cookie, sessionID, that can be used with the HTTP API. Details are scattered throughout the http_api.md documentation and the source code, for example in src/node/padaccess.js and src/static/js/pad.js.

Contributor

@muxator muxator commented Dec 7, 2019

Postponing the release of this documentation to a point release after 1.8.0.

@muxator muxator removed this from the 1.8 milestone Dec 7, 2019
@muxator muxator added this to the 1.8.1 milestone Dec 7, 2019
Contributor Author

@tiblu tiblu commented Apr 13, 2020

@muxator Thanks for the input on this, from our POV we have all the info on all the cookies we're using in our Etherpad deployment.

One thought, though: since privacy is important and GDPR requires very granular cookie consents that describe the use of each cookie to the user, shouldn't we have a place in the documentation that holds up-to-date information about the cookies? I think this GH issue is a good start, but it may be hard to find and could go out of date.

Thanks again for all your work on Etherpad.

Contributor Author

@tiblu tiblu commented Apr 13, 2020

Now that I think about it, token is the one cookie whose usage is not all that clear: how it is actually used and when it is actually needed. Even when public pads are disabled it is still generated, while the claim is "used for public pads".

Member

@JohnMcLear JohnMcLear commented Apr 14, 2020

TLDR; token is an abstraction of authorID so we don't have to always pass authorID? :)

grep -rni "token" .

Server side:

./hooks/express/importexport.js:79:      if (!req.cookies.token) {
./hooks/express/importexport.js:80:        console.warn(`Unable to import file into "${req.params.pad}". No token in the cookies`);
./hooks/express/importexport.js:84:      let author = await authorManager.getAuthor4Token(req.cookies.token);
./hooks/express/importexport.js:87:        console.warn(`Unable to import file into "${req.params.pad}". No Author found for token ${req.cookies.token}`);
./hooks/express/openapi.js:232:    checkToken: {
./hooks/express/openapi.js:233:      operationId: 'checkToken',
./hooks/express/openapi.js:234:      summary: 'returns ok when the current api token is valid',
./db/SecurityManager.js:33: * @param token the token of the author (randomly generated at client side, used for public pads)
./db/SecurityManager.js:37:exports.checkAccess = async function(padID, sessionCookie, token, password)
./db/SecurityManager.js:47:  var deniedByHook = hooks.callAll("onAccessCheck", {'padID': padID, 'password': password, 'token': token, 'sessionCookie': sessionCookie}).indexOf(false) > -1;
./db/SecurityManager.js:52:  // start to get author for this token
./db/SecurityManager.js:53:  let p_tokenAuthor = authorManager.getAuthor4Token(token);
./db/SecurityManager.js:70:      let authorID = await p_tokenAuthor;
./db/SecurityManager.js:85:      // grant access, with author of token
./db/SecurityManager.js:239:    let authorID = await p_tokenAuthor;
./db/SecurityManager.js:245:      // --> grant access, with author of token
./db/SecurityManager.js:252:      // --> grant access, with author of token
./db/AuthorManager.js:52: * Returns the AuthorID for a token.
./db/AuthorManager.js:53: * @param {String} token The token
./db/AuthorManager.js:55:exports.getAuthor4Token = async function(token)
./db/AuthorManager.js:57:  let author = await mapAuthorWithDBKey("token2author", token);
./db/AuthorManager.js:65: * @param {String} token The mapper
./db/AuthorManager.js:82: * so far this is token2author and mapper2author
./db/AuthorManager.js:95:    // create the token2author relation
./db/API.js:780:checkToken() returns ok when the current api token is valid
./db/API.js:787:exports.checkToken = async function()
./handler/PadMessageHandler.js:293:    let { accessStatus } = await securityManager.checkAccess(padId, auth.sessionID, auth.token, auth.password);
./handler/PadMessageHandler.js:859:    token : message.token,
./handler/PadMessageHandler.js:865: * Handles a CLIENT_READY. A CLIENT_READY is the first message from the client to the server. The Client sends his token
./handler/PadMessageHandler.js:873:  if (!message.token) {
./handler/PadMessageHandler.js:874:    messageLogger.warn("Dropped message, CLIENT_READY Message has no token!");
./handler/PadMessageHandler.js:905:  let statusObject = await securityManager.checkAccess(padIds.padId, message.sessionID, message.token, message.password);
./handler/SocketIORouter.js:93:        if (message.padId !== undefined && message.sessionID !== undefined && message.token !== undefined && message.password !== undefined) {
./handler/SocketIORouter.js:100:          let { accessStatus } = await securityManager.checkAccess(padId, message.sessionID, message.token, message.password);
./handler/APIHandler.js:89:  { "checkToken"                : []
./padaccess.js:6:    let accessObj = await securityManager.checkAccess(req.params.pad, req.cookies.sessionID, req.cookies.token, req.cookies.password);
./pad.js:153:  var token = readCookie("token");
./pad.js:154:  if (token == null)
./pad.js:156:    token = "t." + randomString();
./pad.js:157:    createCookie("token", token, 60);
./pad.js:169:    "token": token,
./timeslider.js:33:var token, padId, export_links;
./timeslider.js:48:    //ensure we have a token
./timeslider.js:49:    token = readCookie("token");
./timeslider.js:50:    if(token == null)
./timeslider.js:52:      token = "t." + randomString();
./timeslider.js:53:      createCookie("token", token, 60);
./timeslider.js:120:              "token": token,
./pad_impexp.js:139:  function importSuccessful(token)
./pad_impexp.js:146:        token: token,

Most notable:

/**
 * Returns the AuthorID for a token.
 * @param {String} token The token
 */
exports.getAuthor4Token = async function(token)
{
  let author = await mapAuthorWithDBKey("token2author", token);

  // return only the sub value authorID
  return author ? author.authorID : author;
}

and


/**
 * Returns the AuthorID for a mapper.
 * @param {String} token The mapper
 * @param {String} name The name of the author (optional)
 */
exports.createAuthorIfNotExistsFor = async function(authorMapper, name)
{
  let author = await mapAuthorWithDBKey("mapper2author", authorMapper);

  if (name) {
    // set the name of this author
    await exports.setAuthorName(author.authorID, name);
  }

  return author;
};

So token is a representation of an author.
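A minimal sketch of that idea: the same token always resolves to the same author record. It assumes a plain in-memory `Map` in place of Etherpad's database layer, and it is synchronous for brevity, whereas the quoted functions are async. The key layout, the ID generator and the names `mapAuthorWithKey` and `getAuthorForToken` are hypothetical.

```javascript
// In-memory stand-in for the database (assumption: the real code uses a DB layer).
const store = new Map();
let counter = 0;

// Look up the author stored under "<mapperKey>:<mapper>", creating one if missing.
// The key layout and the "a.sketchN" IDs are invented for this sketch.
function mapAuthorWithKey(mapperKey, mapper) {
  const key = `${mapperKey}:${mapper}`;
  if (!store.has(key)) {
    counter += 1;
    store.set(key, { authorID: `a.sketch${counter}` });
  }
  return store.get(key);
}

// Resolve a token cookie value to an author ID.
function getAuthorForToken(token) {
  return mapAuthorWithKey('token2author', token).authorID;
}

const first = getAuthorForToken('t.tFzkihhhBf4xKEpCK3PU');
const again = getAuthorForToken('t.tFzkihhhBf4xKEpCK3PU');
const other = getAuthorForToken('t.anotherHypotheticalToken');
console.log(first === again); // same token, same author
console.log(first === other); // different token, different author in this sketch
```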

My assumption would be.. session is dynamic, changes, doesn't matter. You need token to let Etherpad know you are who you say you are, type thing.. IE session isn't persistent between restarts, but token is.. It's not used for auth, it's used to say I am "John" who did these edits (but obv doesn't include the edits).

If token is leaked, it's nbd, it's not as if Etherpad would ever tell you what edits you did historically..

@JohnMcLear JohnMcLear changed the title DOCS: Document cookie usage in Etherpad DOCS: Document cookie usage in Etherpad - Waiting on Final Edits Apr 14, 2020
Contributor Author

@tiblu tiblu commented Apr 14, 2020

Thanks for the quick answer @JohnMcLear!

If I understand correctly, there is a 1:1 token-to-author mapping.
I do not understand why express_sid could not be a persistent (non-session) cookie, with an express_sid-to-author mapping instead of the token.

Sorry if I'm being slow here.

Also, I think it's very important for us to explain cookies in a way that any visitor of Etherpad can understand, technical or non-technical. The latter is the audience of privacy control panels and their cookie settings.

Member

@JohnMcLear JohnMcLear commented Apr 14, 2020

express_sid isn't persistent afaik. etherpad was built before you could make it persistent.

That's afaik...

Contributor Author

@tiblu tiblu commented Apr 14, 2020

Thanks a lot! @muxator do you have anything to add to this?


@ukcb ukcb commented Apr 19, 2020

Small hint: If you are already there, you could also replace the domain myinstance.com with example.org. This domain is for use in illustrative examples in documents. You may use this domain in literature without prior coordination or asking for permission.

Contributor

@muxator muxator commented Apr 19, 2020

replace the domain myinstance.com with example.org

Done. Thanks for the hint @ukcb.

@tiblu, could you make a PR for the documentation, so we can work on that one?

Contributor Author

@tiblu tiblu commented Apr 20, 2020

@muxator Sure, I can PR this.

  • I feel like the token explanation is vague. We discussed it with @JohnMcLear in this thread, but I'm not sure we came to the right conclusions. A third opinion would help.
  • sessionID - that is not set as a cookie at all by EP itself? You call HTTP API createSession and get back the session id in the response. Then it's up to the API client to store the sessionID somewhere, somehow?
  • I propose putting the "Cookies" section between "Plugins" and "Database structure" in the documentation.

Contributor

@muxator muxator commented Apr 20, 2020

sessionID - that is not set as a cookie at all by EP itself? You call HTTP API createSession and get back the session id in the response. Then it's up to the API client to store the sessionID somewhere, somehow?

Yep. Not my design, but my conclusion is the same.
I re-did the whole tour of generating a session via command line. I still do not have an answer to your "somewhere", "somehow". I'll update this answer if I find something.

  1. call createAuthorIfNotExistsFor, obtain authorID:

    $ curl "http://localhost:9001/api/1/createAuthorIfNotExistsFor?apikey=XXX&name=muxator&authorMapper=7"
    {"code":0,"message":"ok","data":{"authorID":"a.qxYONNaUPu3zpYpS"}}
    
  2. call createGroupIfNotExistsFor, obtain groupID.

    $ curl "http://localhost:9001/api/1/createGroupIfNotExistsFor?apikey=XXX&groupMapper=7"
    {"code":0,"message":"ok","data":{"groupID":"g.dp3vV0WamTnpiOxc"}}
    
  3. call createSession, obtain a sessionID in the JSON payload that can eventually be used as a cookie for another API call.
    TODO: WHICH ONE?

    $ curl --verbose "http://localhost:9001/api/1/createSession?apikey=XXX&groupID=g.dp3vV0WamTnpiOxc&authorID=a.qxYONNaUPu3zpYpS&validUntil=3333399999"
    > GET /api/1/createSession?apikey=XXX&groupID=g.dp3vV0WamTnpiOxc&authorID=a.qxYONNaUPu3zpYpS&validUntil=3333399999 HTTP/1.1
    [...]
    < HTTP/1.1 200 OK
    [...]
    < Set-Cookie: express_sid=YYY; Path=/; HttpOnly  <-- this makes no sense. Let's ignore for now. There is no sessionID cookie anyway
    < 
    {"code":0,"message":"ok","data":{"sessionID":"s.62a92b62fb2464ab3e43ef8dd1d2096a"}}
    

Give me some time to work out a feasible example of actually using sessionID as cookie. The API documentation is lacking.
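As a rough sketch, the three request URLs from the steps above can be assembled programmatically. This only builds the URLs; it does not contact a server. The helper name `apiUrl` is hypothetical, and the host, API key placeholder and IDs are copied from the curl calls.

```javascript
// Build an HTTP API URL of the shape used in the curl calls above.
// apiUrl is a hypothetical helper, not part of Etherpad.
function apiUrl(base, fn, params) {
  const url = new URL(`/api/1/${fn}`, base);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, value);
  }
  return url.toString();
}

const base = 'http://localhost:9001';
const apikey = 'XXX'; // placeholder, as in the curl examples

const steps = [
  apiUrl(base, 'createAuthorIfNotExistsFor', { apikey, name: 'muxator', authorMapper: '7' }),
  apiUrl(base, 'createGroupIfNotExistsFor', { apikey, groupMapper: '7' }),
  apiUrl(base, 'createSession', {
    apikey,
    groupID: 'g.dp3vV0WamTnpiOxc',
    authorID: 'a.qxYONNaUPu3zpYpS',
    validUntil: '3333399999',
  }),
];

steps.forEach((step) => console.log(step));
```

In a real client the groupID and authorID would come from the responses to the first two calls rather than being hard-coded.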

Contributor

@muxator muxator commented Apr 20, 2020

I feel like token explanation is vague. We discussed it with JohnMcLear, in this thread, but not sure if we came to the right conclusions. 3rd opinion would help.

@tiblu, these are my thoughts after re-reading the table above and the code:

  • when the documentation says that token is used for public pads, it is in error. That cookie is always used by the web interface (as opposed to the HTTP API). You can see it by looking here:

    var token = readCookie("token");
    if (token == null)
    {
      token = "t." + randomString();
      createCookie("token", token, 60);
    }

    This is the sendClientReady() function in static/js/pad.js, the browser executes it no matter what. If we converge on this interpretation, the table above (Edit: done) and the comment in the code have to be updated;

  • The initiative of setting token is taken by the client (again, see the snippet above). As such, it is an unverified assertion the browser makes about the identity of the human writing in the pad. Etherpad takes it as such. I am not wearing my security hat here. Given Etherpad's UX it may even make sense. Initially I did not change the description in the table because I wanted to know whether you agree. Edit: I have updated the table to mention that token is generated on the client side;

  • I have updated the table mentioning the actual spot in which the cookie is generated in the frontend code (the already mentioned pad.js#L153-L158).
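The client-side behaviour described in the points above can be sketched for Node.js as follows. Several things here are assumptions: `randomString` is a hypothetical stand-in for Etherpad's helper (its real alphabet and implementation may differ), `ensureToken` is an invented name, and the third `createCookie` argument is taken to be a lifetime in days. That last assumption is consistent with the sample token expiry in the table (2019-04-26T13:17:07.000Z) if the cookie was set on 2019-02-25 at 13:17:07 UTC.

```javascript
const { randomInt } = require('crypto');

const ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';

// Hypothetical stand-in for Etherpad's randomString(); the real helper may differ.
function randomString(length = 20) {
  let out = '';
  for (let i = 0; i < length; i++) {
    out += ALPHABET[randomInt(ALPHABET.length)];
  }
  return out;
}

// Reuse an existing token, or mint a new "t."-prefixed one, as the quoted snippet does.
function ensureToken(existingToken) {
  return existingToken == null ? `t.${randomString()}` : existingToken;
}

// Expiry date, assuming the cookie lifetime is expressed in days.
function expiryAfterDays(from, days) {
  return new Date(from.getTime() + days * 24 * 60 * 60 * 1000);
}

console.log(ensureToken(null));
console.log(ensureToken('t.tFzkihhhBf4xKEpCK3PU'));
console.log(expiryAfterDays(new Date('2019-02-25T13:17:07.000Z'), 60).toISOString());
```

Note that the server never takes part in minting the value: whatever string the browser presents is accepted as the token.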

p.s.: obviously, our aim here is only to document what exists, without re-designing anything.

Contributor Author

@tiblu tiblu commented Apr 21, 2020

Thanks again for the quick and thorough response, @muxator.

p.s.: obviously, our aim here is only document what exists, without re-designing anything.
Sure thing.

I must admit I went down the rabbit hole and got lost in the layers. That said:

  • I agree, token is generated for all clients, used for all clients.
  • I agree, token is generated on the client side:

    var token = readCookie("token");
    if (token == null)
    {
      token = "t." + randomString();
      createCookie("token", token, 60);
    }

  • token is only used in the context of Socket.IO.
  • One author can have many tokens in the DB (one-to-many).
  • sessionID is actually used only for API access, I have no evidence that sessionID cookie is set somewhere in the code -
    * @param sessionCookie the session the user has (set via api)
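Since the cookie table says the sessionID cookie may carry several comma-separated session IDs, an API client that stores the value itself could split it as sketched below. `parseSessionIDs` is a hypothetical helper, not Etherpad code; the two IDs are the samples that appear earlier in this thread.

```javascript
// Split a sessionID cookie value into individual session IDs.
// Hypothetical helper; empty entries and surrounding whitespace are dropped.
function parseSessionIDs(cookieValue) {
  if (!cookieValue) return [];
  return cookieValue
    .split(',')
    .map((id) => id.trim())
    .filter((id) => id.length > 0);
}

const cookieValue = 's.1c70968b333b25476a2c7bdd0e0bed17,s.62a92b62fb2464ab3e43ef8dd1d2096a';
console.log(parseSessionIDs(cookieValue));
```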

To validate my understanding of the whole authorization/authentication flow this is how I THINK it goes:

Q: One thing I did not have time to dig out is how an Author is created and how an author is tied to a token.

Contributor

@muxator muxator commented Apr 22, 2020

@tiblu: I think we can open a PR and move on from there.

It's better if you open the PR, because I can push on top of that. You wouldn't be allowed to push on top of something started by me.

After it's done, I'll squash the modifications and pull in the PR.

Thanks

Contributor Author

@tiblu tiblu commented Apr 22, 2020

@muxator Created the PR for docs - #3921

  • Could not find a way to test the docs generation. Checked for Travis files and package.json for documentation generation scripts, but could not find how docs are generated.
  • CI seems to fail due to some setup problems, not related to the change.
  • Sorry for the initial PR, it contained more changes than I wanted. Investigating why our fork has additional changes compared to Etherpad's own "develop".

Contributor Author

@tiblu tiblu commented Apr 22, 2020

@muxator OK, found out which changes in our fork are not yet merged into Etherpad "develop" - PR: #3791

muxator pushed a commit to tiblu/etherpad-lite that referenced this issue Apr 23, 2020
muxator pushed a commit to tiblu/etherpad-lite that referenced this issue Apr 24, 2020
muxator added a commit that referenced this issue Apr 24, 2020
"token" is a random token representing the author, of the form
t.randomstring_of_lenght_20. The random string is generated by the client. The
cookie is used for every pad in the web UI, and is not used for HTTP API.

This comes from the discussion at #3563
muxator pushed a commit to tiblu/etherpad-lite that referenced this issue Apr 24, 2020
muxator pushed a commit that referenced this issue Apr 24, 2020
tiblu added a commit to tiblu/etherpad-lite that referenced this issue Apr 27, 2020
Contributor Author

@tiblu tiblu commented Apr 27, 2020

@muxator Looking good! Thanks for all your work on the project! Sorry for me missing the Makefile. Generation works like a charm.

Created a PR for table readability improvement - #3948

muxator pushed a commit to tiblu/etherpad-lite that referenced this issue May 3, 2020
This is a cosmetic fix for PR ether#3921 (109aa2d).
Discussion on ether#3563
muxator pushed a commit that referenced this issue May 3, 2020
This is a cosmetic fix for PR #3921 (109aa2d).
Discussion on #3563