
Parse Server becomes very slow and created tons of duplicate Session objects #4210

Closed
eeallen1 opened this issue Sep 25, 2017 · 10 comments

@eeallen1

eeallen1 commented Sep 25, 2017

Issue Description

My Parse Server becomes unusably slow, and the only clues I have are that multiple Session objects are being created for existing users and that the error logs show a repeated 'error: invalid session token code=209, message=invalid session token' over and over.

There are sometimes as many as 20 Sessions created for a single user in under a minute, always with duplicate info. It seems to happen when users are logging in, as the 'createdWith' field always says login instead of signup.

Other errors in my app also start happening, and I'm not sure if they're related to this one, caused by it, or the cause of it. Namely, account creation keeps failing with the error "The data couldn’t be read because it isn’t in the correct format" when I try to sign up a user with email and password. This has only ever happened at the same time as the session token weirdness. The accounts that fail to create are not the same ones linked to the duplicate session tokens.

Steps to reproduce

This has only started happening recently. I updated to the latest version of Parse Server, but the issue occurs both before and after the update. The only thing I know for sure is that when my server slows down, I get a lot of invalid session token errors, a lot of duplicate Session objects, and a lot of failed account creations (even though, when I check, the User objects have been saved in Parse).

I had about 100 concurrent users on my app when this happened last.

I'm happy to provide more detail if there's something specific that can help.

Actual Outcome

All instances of my Parse Server become very slow and nearly unresponsive. The droplets they're running on are healthy, but the server response time is so high it's unusable. It doesn't matter how many hosts I put behind the load balancer; the result is the same.

When I run Parse Server and the dashboard locally with no clients connected, everything works fine.

Environment Setup

  • Server

    • parse-server version (Be specific! Don't say 'latest'.) : [FILL THIS OUT]
    • Operating System: Ubuntu 16.04.2 LTS x64
    • Hardware: 8 GB memory / 40 GB disk
    • Localhost or remote server? Multiple DigitalOcean droplets with nginx as a load balancer
  • Database

    • MongoDB version: 3.0.12
    • Localhost or remote server? mLab

Logs/Trace

I mostly just get a lot of these:

error: invalid session token code=209, message=invalid session token
error: invalid session token code=209, message=invalid session token
error: invalid session token code=209, message=invalid session token

When I turn on verbose logging, the output becomes overwhelming with all of the request data, but the errors don't provide any more info. I'm happy to share the verbose logs if it could help.

@flovilmart
Contributor

@eeallen1 do you have any indexes on your database? Missing indexes on those hot paths could explain the problem.
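For anyone following along, here is a minimal mongo shell sketch of the kind of index the session hot path needs. The field names follow the standard Parse Server Mongo schema, so verify them against your own collections before building anything:

// Every authenticated request looks a session up by its token.
db.getCollection("_Session").createIndex({ _session_token: 1 })
// Optional: speeds up fetching all sessions for a given user (pointer field).
db.getCollection("_Session").createIndex({ _p_user: 1 })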

@eeallen1
Author

@flovilmart I usually build the recommended indexes in mLab, so I assume most of them are there. I'll go add any that are missing and report back.

@flovilmart
Contributor

Also, you mention that many sessions are being created for a single user; this should be fixed now, with only a single session per user/installationId pair. You should check your client code and make sure you don't have a bug that calls 'login' too many times.
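To illustrate that client-side guard, here is a rough sketch with the Parse JS SDK (the keys and URL are placeholders, and the same current-user pattern exists in the iOS SDK); it only calls logIn when there is no cached session, so repeated app launches don't keep minting new Session objects:

const Parse = require('parse/node');
Parse.initialize('APP_ID');                      // placeholder; use your own keys
Parse.serverURL = 'https://example.com/parse';   // placeholder server URL
Parse.User.enableUnsafeCurrentUser();            // lets a Node client keep a current user

// Reuse the locally cached user instead of hitting /login on every launch.
async function ensureLoggedIn(username, password) {
  const cached = Parse.User.current();
  if (cached && cached.getSessionToken()) {
    return cached;                                // reuse session, no new _Session row
  }
  return Parse.User.logIn(username, password);    // only logs in when actually needed
}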

@eeallen1
Author

@flovilmart is there any way to get more info about the invalid session token error message? Like who the user is, which session object it is, or anything like that?

My app seems to be working normally now after I just resized my nginx server, but I'm not convinced that it had anything to do with the problem, as this same problem happened 5 days ago and mysteriously stopped a few hours later. I'm still getting a ton of invalid session token errors, and I'm not really sure why. The duplicate sessions have stopped popping up, however, and the user signup errors have also stopped for now.

I think the duplicate sessions and the signup account errors might have something to do with requests being too slow/timing out? I'm not sure what's causing the slowdown in the first place though. Could it have been as simple as too many inbound requests to my load balancer? I'm curious to know where the invalid sessions could be coming from.

@flovilmart
Contributor

You should be able to log all traffic coming in and out by setting VERBOSE=1. Otherwise, at the nginx level you could also log the request headers/responses to help identify the issue. For the rest, I'm not sure; are your CPU/memory levels high on the node instances, etc.?
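If full VERBOSE output is too noisy, a small Express middleware mounted ahead of the Parse mount point can log only the failing requests. This is just a sketch and assumes the usual parse-server-example wiring where you control the Express app:

const express = require('express');
const { ParseServer } = require('parse-server');

const app = express();
const api = new ParseServer({ /* your existing server config goes here */ });

// Log only error responses so the output stays readable.
app.use('/parse', (req, res, next) => {
  res.on('finish', () => {
    if (res.statusCode >= 400) {
      console.log(`${res.statusCode} ${req.method} ${req.originalUrl}`);
    }
  });
  next();
});

app.use('/parse', api);   // mount Parse Server after the logging middleware
app.listen(1337);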

@eeallen1
Author

@flovilmart The CPU/memory usage is typically under 30% for the machines running my node instances as well as the machine with nginx.

I'm thoroughly stumped by what's going on here. It's hard to track down which requests are causing the invalid session errors: verbose logging produces an unwieldy amount of output, and as far as I know there's no other way to tell which requests are producing them.

My nginx logs are producing lots of lines like this:

2017/09/26 17:35:44 [error] 1647#0: *753314 connect() failed (111: Connection refused) while connecting to upstream, client: client-ip, server: my.nginx.server, request: "POST /parse/classes/Brand HTTP/1.1", upstream: "http://my.parse.server:1337/parse/classes/Brand", host: "my.nginx.server"
2017/09/26 17:35:45 [error] 1646#0: *708648 connect() failed (111: Connection refused) while connecting to upstream, client: client-ip, server: my.nginx.server, request: "POST /parse/classes/Dress/D0A41ErAJQ HTTP/1.1", upstream: "http://my.parse.server:1337/parse/classes/Dress/D0A41ErAJQ", host: "my.nginx.server"

But I'm not sure how to get info from the request/response to go into more detail.

Most of the problems I mentioned before have disappeared for now, but I'm still getting really high rates of the invalid session token error, and they always come in batches. I won't see them for a few minutes, and then there are 50 consecutive errors all at once.

To give you an idea, here are the New Relic error rates. Literally all of the errors here are the invalid session tokens.
[screenshot: New Relic error rates, Sep 26, 2017, 12:01 PM]

Do you know of a way for me to tell exactly which requests are producing the errors? I'm concerned that I still don't know where the problem was coming from in the first place, and why the server slowdown and duplicate sessions have stopped while the session token errors still exist.

I think my client code may be signing up users and then immediately saving them, which seems to produce a rare "Account already exists for this username" error. Does that, or successive sign-in attempts, explain any of this other behavior?
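For reference, here is a sketch of the sign-up-then-save flow done strictly in sequence (JS SDK; the extra field and names are just illustrative), so the follow-up save never races the signup request:

const Parse = require('parse/node');

// Wait for signUp to finish before saving extra fields, so the save
// uses the session token returned by signUp instead of racing it.
async function signUpAndFill(username, password, email) {
  const user = new Parse.User();
  user.set('username', username);
  user.set('password', password);
  user.set('email', email);

  await user.signUp();            // creates the _User and its session
  user.set('onboarded', false);   // example field, not from this thread
  return user.save();
}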

@dongalor

dongalor commented Oct 30, 2017

I'm having the same issue:

error: invalid session token code=209, message=invalid session token

The client is an iOS application that doesn't use the Parse iOS SDK; instead we have cloud functions on the backend for all operations. After the application is killed, it logs in again and gets a sessionToken that can't be found in the database :(

Also, I've updated parse-server to the latest 2.6.5 and am still having the same issue.

@Rioner123

Hi all,

Were you able to figure out these issues?

@montymxb
Contributor

Could this be related to the single schema cache issue, #4247?
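If it turns out to be related, the schema cache behavior can be toggled when constructing the server. This is only a config sketch, and the option name (enableSingleSchemaCache) should be double-checked against the parse-server release you're running:

const { ParseServer } = require('parse-server');

// Config sketch; enableSingleSchemaCache is the option discussed around #4247.
const api = new ParseServer({
  databaseURI: process.env.DATABASE_URI,
  appId: process.env.APP_ID,
  masterKey: process.env.MASTER_KEY,
  serverURL: process.env.SERVER_URL,
  enableSingleSchemaCache: true   // share one schema cache instead of one per request
});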

@eeallen1
Author

So I finally figured out what was going on here, and it may or may not be the same cause for people experiencing similar issues.

I use mLab for my hosted MongoDB, and their M1 and M2 plans are hosted on AWS T2 instances, which have something called "burst credits" that limit how much CPU you can use. As an end user of mLab, this limit was all but invisible to me. My issues in this thread stemmed from high CPU utilization that exhausted the AWS burst credits, which is why the slowdowns seemed so random.

If anyone using mLab is experiencing a similar issue, your CPU usage is probably too high because of a suboptimally indexed query that's doing an in-memory sort. You can either rewrite/reindex your queries to be more efficient, upgrade your mLab instance, or both.
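As a concrete way to spot those in-memory sorts from the mongo shell (sketch only; the collection and field names are taken from the logs above and are just illustrative):

// Look for a "SORT" stage in the winning plan: that is the in-memory sort.
db.getCollection("Dress").find({ _p_brand: "Brand$xxxxxxxxxx" })
  .sort({ _created_at: -1 })
  .explain("executionStats")

// If a SORT stage shows up, add an index that covers the filter plus the sort key.
db.getCollection("Dress").createIndex({ _p_brand: 1, _created_at: -1 })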
