Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Duplicate Key Error (maybe not related with Fastapi-user but help needed) #349

Closed
MariusMez opened this issue Sep 29, 2020 · 19 comments
Closed
Labels
question Further information is requested

Comments

@MariusMez
Copy link

Hi @frankie567

Sometimes in a "production" application using fastapi-user (with Gunicorn and uvicorn and concurrency), I have this error I'm not unable to reproduce by myself in dev mode.

It occurs sometime at registration when two users want to register in the same two minutes.
I don't know if it's related to the use of Fastapi-user but maybe I can find some help here to understand this error :-)

In fact this UUID mentioned in the error below appear to be the uuid of the previous saved user. I'm wondering if it's not a concurrency issue with gunicorn?

DuplicateKeyError: E11000 duplicate key error collection: myproject_db.users index: id_1 dup key: { id: UUID("c8ff467d-d490-4e32-b0b6-21ac94905dd5") }, full error: {'index': 0, 'code': 11000, 'keyPattern': {'id': 1}, 'keyValue': {'id': UUID('c8ff467d-d490-4e32-b0b6-21ac94905dd5')}, 'errmsg': 'E11000 duplicate key error collection: myproject_db.users index: id_1 dup key: { id: UUID("c8ff467d-d490-4e32-b0b6-21ac94905dd5") }'}
File "myprojectapp/core/proxy_headers_middleware.py", line 43, in call
return await self.app(scope, receive, send)
File "starlette/middleware/cors.py", line 86, in call
await self.simple_response(scope, receive, send, request_headers=headers)
File "starlette/middleware/cors.py", line 142, in simple_response
await self.app(scope, receive, send)
File "starlette/exceptions.py", line 82, in call
raise exc from None
File "starlette/exceptions.py", line 71, in call
await self.app(scope, receive, sender)
File "starlette/routing.py", line 566, in call
await route.handle(scope, receive, send)
File "starlette/routing.py", line 227, in handle
await self.app(scope, receive, send)
File "starlette/routing.py", line 41, in app
response = await func(request)
File "fastapi/routing.py", line 183, in app
dependant=dependant, values=values, is_coroutine=is_coroutine
File "fastapi/routing.py", line 133, in run_endpoint_function
return await dependant.call(**values)
File "fastapi_users/router/register.py", line 38, in register
created_user = await user_db.create(db_user)
File "fastapi_users/db/mongodb.py", line 64, in create
await self.collection.insert_one(user.dict())
File "concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "pymongo/collection.py", line 701, in insert_one
session=session),
File "pymongo/collection.py", line 615, in _insert
bypass_doc_val, session)
File "pymongo/collection.py", line 603, in _insert_one
acknowledged, _insert_command, session)
File "pymongo/mongo_client.py", line 1498, in _retryable_write
return self._retry_with_session(retryable, func, s, None)
File "pymongo/mongo_client.py", line 1384, in _retry_with_session
return self._retry_internal(retryable, func, session, bulk)
File "pymongo/mongo_client.py", line 1416, in _retry_internal
return func(session, sock_info, retryable)
File "pymongo/collection.py", line 600, in _insert_command
_check_write_command_response(result)
File "pymongo/helpers.py", line 230, in _check_write_command_response
_raise_last_write_error(write_errors)
File "pymongo/helpers.py", line 211, in _raise_last_write_error
raise DuplicateKeyError(error.get("errmsg"), 11000, error)

@MariusMez
Copy link
Author

Also a precision concerning the uuid use, here is my User class, I set uuid here because I need to have it in str format in the field uuid_str (and I didn't find a way to do it without updating the object in an other query after registration by getting the value set with the default validator in models.BaseUser)

class User(models.BaseUser):
    id: Optional[UUID4] = uuid.uuid4()
    uuid_str: Optional[str] = str(id)
[...]

@frankie567 frankie567 added the question Further information is requested label Sep 30, 2020
@frankie567
Copy link
Member

Hi @MariusMez!

I think there is something wrong with your model here. The way I see it, you generate the UUID when the source code is loaded, and never after. A bit like when we mistakenly use mutable default arguments (see https://docs.python-guide.org/writing/gotchas/). Simple reproducible example:

import uuid
from pydantic import BaseModel, UUID4

class User(BaseModel):
  id: UUID4 = uuid.uuid4()

u1 = User()
u2 = User()

print(u1.id, u2.id)  # UUID are the same

I think it works most of the time because Gunicorn should spawn a new process, and thus the code is reloaded with a new value. However, when there is concurrency, the same process is reused and they end up using the same id.

To circumvent this, you should feed the value at runtime. Validators in Pydantic are here for this purpose. So your model could look like:

from typing import Optional

from fastapi_users import models
from pydantic import root_validator


class User(models.BaseUser):
    uuid_str: Optional[str] = None
    
    # root_validator allows to get the values of other fields once they are all validated
    @root_validator
    def stringify_uuid(cls, values):
        id = values.get("id")
        values["uuid_str"] = str(id)
        return values

@MariusMez
Copy link
Author

Thanks @frankie567 for your answer and your explanation about the Pydantic validators I didn't use correctly. Much appreciated 👍

I have to dig more on this, because now I have the same error but for the email field (wich I didn't redefine in my User model )

DuplicateKeyError: E11000 duplicate key error collection: myproject_db.users index: email_1 dup key: { email: "email@gmail.com" }, full error: {'index': 0, 'code': 11000, 'keyPattern': {'email': 1}, 'keyValue': {'email': 'email@gmail.com'}, 'errmsg': 'E11000 duplicate key error collection: myproject_db.users index: email_1 dup key: { email: "email@gmail.com" }'}

@frankie567
Copy link
Member

You're welcome!

Hmm, that's weird indeed because existing e-mail should be checked in register route before trying to inserting a new user. Did you override the register route in some way? Or did you override the email_collation parameter in the DB adapter?

@MariusMez
Copy link
Author

MariusMez commented Oct 2, 2020

Yes very strange as I can't reproduce the problem on my computer (with a local mongo) using the same launch command with gunicorn workers. Everything is working great

In fact I "solved the problem" by deleting one index on my mongo prod instance (a cluster in Mongodb cloud atlas). The one related to unique email (I didn't delete the one on case_insensitive email).

Deleting this unique email field index solve the error and the registring process goes well (no duplicate, email existing checking case_insensitive is ok)

So for the moment I can close this ticket, but I'm wondering what is the real issue behind this...

And to answer your questions, I didn't override the register route nor the email_collation parameter.

@frankie567
Copy link
Member

frankie567 commented Oct 2, 2020

Hmm... There is something wrong with those indices ; I'll try to dig more into this.

@MariusMez
Copy link
Author

And of course when I restart the application, the missing indexes are newly created and the problem re-appear :D

@frankie567
Copy link
Member

Yeah, sure they are defined here:

https://github.com/frankie567/fastapi-users/blob/97cc799737d8cd8d9207c04afd37a5b90fb9eb03/fastapi_users/db/mongodb.py#L22-L42

The way it should work:

  • There is a unique index on email ; so we shouldn't be able to insert email@gmail.com twice.
  • If a guy tries to register with email@gmail.com or even EMAIL@gmail.com, the case-insensitive index should tell us that this email already exists and should raise an HTTP error before trying to insert this one.

So, theoretically, this should work.

Can you reproduce it locally? If so, could you share the steps you are following?

@MariusMez
Copy link
Author

MariusMez commented Oct 2, 2020

Locally, it's working perfectly.

But with a "prod Mongo", the problem appears every time but only after two registrations. And if I delete the last user in the database, it allow two more registrations before the DuplicateKey error. (I tried with more than 1 gunicorn workers, and also directly with unicorn and 1 worker, the same)

When I delete the unique index on email it works again.

I test to delete this unique email index also locally and after that the email checking (case-insensitive) is working well. So maybe this unique email index is redondant with the case_insensitive_email_index wich seems to be good alone.

@frankie567
Copy link
Member

So, if I understand correctly:

  1. A guy comes and registers with "email@gmail.com" ; all is good.
  2. Another guy tries to register with the same email ; and the error is raised.

Is that it?

@MariusMez
Copy link
Author

MariusMez commented Oct 2, 2020

Not really, initially the issue raise in this case:

  1. A guy comes and registers with "email@gmail.com" ; all is good.
  2. An other guy comes and registers with "email_2@gmail.com" ; all is good.
  3. A third guy comes and registers with "email_3@gmail.com" --> Error Duplicate Key

--> Before removing the unique email index, to allow the third guy to register I have to restart the application and it will go for two more registrations before the error, or deleting an entry in the user collection.

For the workflow I didn't test with a 1bis. guy trying to register twice with "email@gmail.com" but the email checking was working.

@frankie567
Copy link
Member

🤯

@MariusMez
Copy link
Author

And of course for my local mongodb, no problem.

The fix I found and which seems good for now is to remove the unique email index (only this one).

@frankie567
Copy link
Member

Random thought: you said you were on an Atlas cluster ; are you using the right connection URL that handle clusters correctly ? In that form mongodb+srv://<username>:<password>@<server>.mongodb.net/<dbname>?retryWrites=true&w=majority

@MariusMez
Copy link
Author

MariusMez commented Oct 2, 2020

Yes but without specifying the dbname

@frankie567
Copy link
Member

I suspect there are some gotchas with the cluster/sharded nature of Atlas servers. There are a number of restrictions about this in the MongoDB doc I can't fully understand : https://docs.mongodb.com/manual/core/index-unique/

What you can do to circumvent this while we figure this out is to override the MongoDBUserDatabase constructor to drop the index:

class MongoDBUserDatabaseWithoutIndex(MongoDBUserDatabase):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.collection.drop_index("email_1")

@MariusMez
Copy link
Author

Thanks for the temporary fix, like you I don't understand well the differences

@MariusMez
Copy link
Author

FYI I've updated my mongo Atlas Cluster version from 4.2 to 4.4 and it seems the problem is gone... All is working OK with all the indexes.

@frankie567 Sorry for the headache and thanks for the Pydantic tips ! I let you close this issue ;-)

@frankie567
Copy link
Member

Glad to hear it! I'm also on an Atlas Cluster in 4.2 ; I didn't have this issue yet, fingers crossed 🤞

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants