Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data Corruption vulnerability on TinyDB version 4.7.1 #529

Closed
deadoverflow opened this issue Jun 5, 2023 · 5 comments
Closed

Data Corruption vulnerability on TinyDB version 4.7.1 #529

deadoverflow opened this issue Jun 5, 2023 · 5 comments

Comments

@deadoverflow
Copy link

I have discovered a weird form of vulnerability on TinyDB version 4.7.1 that leads to data corruption on the referenced database. Here is the summary:

So I have created a simple flask project to showcase this vulnerability a little bit better and also to present it's attack surface. Here is the backend code for the application:

from flask import Flask 
import tinydb

app = Flask(__name__)
id_ = 0
@app.route('/<data>')
def Insert(data):
    global id_
    db = tinydb.TinyDB('data.json')
    db.insert(
        {
            'id': id_,
            'data': data
        }
    )
    id_ += 1
    return {'id': id_-1, 'data': data}

@app.route('/update/<id>/<data>')
def Update(id, data):
    db = tinydb.TinyDB('data.json')
    node = db.search(tinydb.Query().id == int(id))
    if node == []:
        return "no such node with that id"
    else:
        node[0]['data'] = data
        db.update(node[0], tinydb.Query().id == int(id))
        return node[0]

@app.route('/')
def Home():
    db = tinydb.TinyDB('data.json')
    return db.all()

app.run(port=80)

As you can see there are 3 views here and I've implemented basic functionality here that is good for this showcase.
View Update() is interesting because it takes id and data as a parameter,

@app.route('/update/<id>/<data>')
def Update(id, data):

then it preforms a db lookup for a node whose id is matching the one supplied

db = tinydb.TinyDB('data.json')
node = db.search(tinydb.Query().id == int(id))

and finally if there is a node in the db that matches this criteria, the node will be updated with a new data string for it's key "data".

if node == []:
        return "no such node with that id"
    else:
        node[0]['data'] = data
        db.update(node[0], tinydb.Query().id == int(id))

The vulnerability arises when an attacker tries to update a node way to fast.
If an attacker sends few requests to update the node with it's corresponding id within a second or 2 the last byte of the database, in this case, "data.json" would be replaced by a NULL byte thus triggering an internal server error and the web application is unusable from that point on.

image

Here is how the database looks like now:
image

This issue could be related to a race condition as the previous request by the attacker triggered an update but it wasn't completed yet and the attacker send a second request already which caused another update on top of unfinished one. That is just my theory on why this happens and I possibly could be wrong about why this happens. Regardless, here is the code that i used for the attack;


import requests
import threading
import random


def attack():
    for i in range(15415155):
        r = requests.get('http://localhost/update/0/' + str(random.randint(1000, 99999999)))
        print(i)

t = []

for i in range(50):
    p = threading.Thread(target=attack)
    p.start()

You can just run it for a 1 or 2 seconds and it is enough to break the db.

image

As seen in the application logs, it only took around 8 requests to corrupt the last byte of the db.

I hope you will find this report useful!

Best regards,

deadoverflow

@MrPigss
Copy link
Contributor

MrPigss commented Jun 7, 2023

Using multi- processing/threading is not natively supported by TinyDB. It's not specifically mentioned in the docs but it does list it as a reason not to use TinyDB (see here).

It is perfectly possible to use TinyDB with multi- processing/threading but you'll need to put locks in the appropriate places.

In other words this is not a bug, but maybe it should be mentioned in the docs more clearly.

Edit:
I see you use app.run(), this shouldn't be using more than one thread or process (unless the threaded or processes params are set). In short, this should work. Did you use something like gunicorn, uvicorn, ... when running this test? If not then this might indeed be a bug.

@deadoverflow
Copy link
Author

deadoverflow commented Jun 7, 2023

I just ran app.run() without using gunicorn or anything like that, i just straight up ran python app.py.

In theory, yes this was not mentioned in the docs which means developers are going to use tinydb in their applications and when deploying to for example namecheaps hosting service, it takes passenger_wsgi.py file which just imports the application from its respective server file and preforms app.run(). So it only starts one process which is what i initially tested.

app.run() will in fact start only one process so yeah i think this could be a bug but not entirely sure about it either

If you want me to upload folder where are all the files that i used for testing i will gladly do so.

@deadoverflow
Copy link
Author

deadoverflow commented Jun 7, 2023

Well, when two users send each a HTTP request to the flask server, the server would handle them asynchronously, meaning this opens up for a multi threading. One thread (or request) is trying to preform a write to the db while second thread just started and is behind first thread for 0.02 seconds for example. Than this race condition somehow corrupts the last byte of the db.

This could not be a bug but definitely shall be mentioned in the documentation of the tinydb so that developers can effectively take action to avoid this behavior which causes disruption of service in the end. Especially developers who are trying to implement tinydb on their web applications will find this information crucial as without it, their website is prone to easy DOS with no questions asked!

@deadoverflow
Copy link
Author

i see that you have updated the docs for tinydb v4.8.0 i think. The "Why not use TinyDB" section implies that if you want to use tinydb in HTTP server environment, you should probably look for better solutions. I think this sums it up!

@msiemens
Copy link
Owner

msiemens commented Jul 24, 2023

i see that you have updated the docs for tinydb v4.8.0 i think.

I just looked this up and this sentence has been there since at least v3.0.0 (back in 2015!) 🙂

The point about Flask is correct though. Back when I developed TinyDB, Flask did NOT have multithreading. So it never crossed my mind that using TinyDB with Flask could cause any problems. But since v1.0 in 2018, Flask decided to turn on multithreading by default. I've now updated the documentation to explicitly mention this fact.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants