
Added storage_max_events option, updated pg_storage event loading #2

Open
wants to merge 5 commits into postgre-strorage

Conversation

johnarnold

This option allows limiting the maximum number of events loaded from the storage provider (e.g. PostgreSQL). Thus, loading performance stays deterministic regardless of how big the table gets.

@@ -27,7 +27,14 @@

_add_event = """INSERT INTO events (time, data) VALUES (%s, %s) ON CONFLICT DO NOTHING"""

_all_events = """SELECT data FROM events ORDER BY time ASC"""
_get_events = """SELECT data FROM (
Owner

@johnarnold My SQL skills are a bit rusty... Does this query work as expected (return all events) when max_events is None?
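For reference, PostgreSQL treats LIMIT NULL the same as LIMIT ALL, so a parameterized limit of None returns every row. A minimal sketch, assuming a psycopg2-style driver; the connection string and simplified query text are illustrative, not the PR's actual code:

import psycopg2

# Simplified, illustrative query -- not the PR's actual _get_events statement.
_get_events = """SELECT data FROM events ORDER BY time DESC LIMIT %s"""

with psycopg2.connect("dbname=flower") as conn:      # connection details are illustrative
    with conn.cursor() as cur:
        cur.execute(_get_events, (None,))             # None -> LIMIT NULL -> same as no LIMIT
        rows = cur.fetchall()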

Owner
@bosonogi Feb 4, 2018

@johnarnold Limiting by a number of events can result in an incomplete task load at the point where loading is cut off. I've just tested this query for loading the events for the last x tasks:

SELECT data FROM events WHERE data->'uuid' IN (
  SELECT data->'uuid' FROM events
  WHERE data @> '{"type": "task-received"}'
  ORDER BY time DESC LIMIT 4
)
ORDER BY time ASC

Author
@johnarnold Feb 5, 2018

max_events should never be None, as it defaults to 30000 (i.e. 3x the default max_tasks).

Also, it's loading the newest events, so while you could get a partial task load, it will be the oldest task(s) that are incomplete. Does that cause any problems? I don't see a way to avoid it, other than keeping 'all events forever' and that doesn't seem doable or useful either.

Author

My thought was that if you load enough (or slightly more) events than are needed to populate the task cache up to its max_tasks size, any partial tasks will get dropped out of the cache.

Author
@johnarnold Feb 5, 2018

I have this running in production, btw. My ratios might be slightly off --
storage_max_events = 300k yielded ~ 75k tasks, whereas my max_tasks = 100k. So the storage_max_events should probably be around 4x the max_tasks.

But it does work.

Author

I'm not sure a contains-type query against the GIN index will get us the performance we want. I'm not a PostgreSQL expert, so maybe there's a query plan optimization we're missing. My head goes towards making uuid its own column and indexing it, so we can do a simpler distinct + order + limit query to build a list of n tasks and then fetch all their events.
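A hedged sketch of what that simpler query might look like, assuming a dedicated, indexed uuid column; the constant name and exact SQL are illustrative, not taken from the PR:

# Illustrative only -- picks the ids of the N most recently active tasks via the
# uuid column, then loads all events for those tasks in ascending time order.
_get_recent_task_events = """
SELECT data FROM events
WHERE uuid IN (
    SELECT uuid FROM events
    GROUP BY uuid
    ORDER BY MAX(time) DESC
    LIMIT %s
)
ORDER BY time ASC
"""

# cursor.execute(_get_recent_task_events, (max_tasks,))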

Owner

@johnarnold Another approach might be to keep the query simple as it was, and move the limit logic to Python code. As we read all events from the database one at a time in descending order, we could detect when all events for max_tasks number of tasks have been loaded and stop reading. Then these events could be replayed in the correct (ascending) order.

This would involve loading the events to memory before using them, but by design that amount of data must be small enough to fit temporarily in memory without issues.
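A rough sketch of that approach, combined with the set-of-uuids check suggested in the reply below. Names and details are illustrative, not code from the PR; it assumes JSONB values come back as Python dicts, which psycopg2 does by default:

def load_recent_events(cursor, max_tasks):
    # Read newest-first and stop once events for max_tasks distinct tasks are in hand.
    cursor.execute("SELECT data FROM events ORDER BY time DESC")
    collected = []
    seen_uuids = set()
    for (data,) in cursor:
        uuid = data.get('uuid')              # task events carry the task id here
        if uuid is not None:
            seen_uuids.add(uuid)
            if len(seen_uuids) > max_tasks:
                break                        # enough tasks collected; stop reading
        collected.append(data)
    collected.reverse()                      # replay in ascending (original) order
    return collected

As discussed above, the oldest task kept can still be partially loaded; the assumption is that it would be the first to drop out of the cache.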

Author

@bosonogi That actually could be pretty straightforward, as a simple Set of uuids with a len() check against max_tasks would tell us when to stop. I'm a bit worried though that the 'get all events' query will be slow to return a cursor as the table grows. I'll do some testing. I think exposing uuid as a column may end up giving the desired perf and keep the query on the simple side.

As an aside, I'm seeing utilization on my PostgreSQL server growing at a rate of about 1% per hour until it flatlines (70% currently) -- I suspect the GIN index becomes troublesome to update at very large table sizes. So we should either think about occasionally truncating the table, or ditching the GIN index. Give it some thought, and let me know if you see similar behavior?

Author

Interesting reading: ORDER BY kills GIN index perf and forces a b-tree index:
https://dba.stackexchange.com/questions/113132/gin-index-not-used-when-adding-order-clause

Owner

Interesting reading indeed. We do order by on the "time" column, so we might get away with it.

I'm not against adding one more column and index for the UUID if it proves to give a significant boost in performance. Note, however, that the UUID field is unfortunately named: it is possible to define any string as the task ID (and we do that in our system so that task IDs are more meaningful). This means you can't limit the column type with UUID-related assumptions.

@johnarnold
Author

@bosonogi Take a look at the latest here, I implemented the uuid column, index, and max_tasks based query. So far so good in my testing.

Note: to test, you may want to drop your events table in PostgreSQL and let it be recreated.

@@ -18,16 +18,40 @@
(
id SERIAL PRIMARY KEY,
time TIMESTAMP NOT NULL,
uuid VARCHAR(36) NOT NULL,
data JSONB NOT NULL,
Author

Could store as TEXT instead of JSONB since we're not using a GIN index... maybe get some more perf by not validating the JSON?

Owner

Dropping the GIN index and the JSONB is not an option for me, as I already depend on those for some data mining that is separate from Flower -- and I expect there will be more of that in the future.

@johnarnold
Author

@bosonogi The latest changes fix up the loading performance. It's suitably fast now. However, I ran into another performance bottleneck -- write performance on the Flower side.

I set up a simple test to spray tasks:

@app.task()
def emit_true():
    return True


class TestPerformance:

    def test_emit_true(self):
        # net_task is the module where emit_true is defined
        for i in range(1, 100000):
            net_task.emit_true.delay()

The result is that Flower basically locks up when I run the test. I haven't dug into the event callback and Tornado enough to understand how it's working yet, but it seems like it's just pegging one thread. Is there a way around this, or are we hitting single-thread perf due to the global connection object?

More digging needed, just thought I'd throw it out there before I call it a night.

@bosonogi
Owner

Database inserts could be queued up from the main thread in a queue.Queue and executed in a separate worker thread. This would avoid blocking the main thread while still eventually filling up the database in the correct order.
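A minimal, untested sketch of that idea; the class name and connection handling are illustrative, and only the _add_event statement is taken from the diff above:

import queue
import threading

_add_event = """INSERT INTO events (time, data) VALUES (%s, %s) ON CONFLICT DO NOTHING"""


class EventWriter:
    """Drains queued events into PostgreSQL on a single worker thread."""

    def __init__(self, conn):
        self._conn = conn
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def write(self, time, data):
        # Called from the main (Tornado) thread; returns immediately.
        self._queue.put((time, data))

    def _run(self):
        while True:
            time, data = self._queue.get()
            with self._conn.cursor() as cur:
                cur.execute(_add_event, (time, data))
            self._conn.commit()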

@@ -18,16 +22,40 @@
(
id SERIAL PRIMARY KEY,
time TIMESTAMP NOT NULL,
data JSONB NOT NULL,
uuid VARCHAR(36) NOT NULL,
data TEXT NOT NULL,
Owner

@johnarnold Dropping the GIN index and the JSONB is not an option for me, as I already depend on those for some data mining that is separate from Flower -- and I expect there will be more of that in the future.

Author
@johnarnold Feb 14, 2018

@bosonogi Would you need the index if the table schema was expanded? Are there specific columns you need?

JSONB is probably fine (it's about a 20-30% perf hit and a 200% increase in table size), but the GIN index computation is pretty brutal as you get to large event table sizes. I can profile it some more for specifics after I figure out the write queueing.

Owner
@bosonogi Feb 14, 2018

@johnarnold Well, the idea is to be able to access any fields from events in queries. While we don't need access to all of them right now, we do not know which of them we might need in the future.

Inserting in a separate thread should help with the increased computation cost. I've made a pull request to your branch with some quick changes (johnarnold#4) to possibly help things move along -- and to be a bit more involved ;) I hope I will have time to do more than just suggest untested code in the future.

@@ -18,16 +22,40 @@
(
id SERIAL PRIMARY KEY,
time TIMESTAMP NOT NULL,
data JSONB NOT NULL,
uuid VARCHAR(36) NOT NULL,
Owner

@johnarnold 36 characters is too few... I've said in a previous comment that these are not always uuids but can be arbitrary strings -- which can easily be longer than 36 characters. As an example, the current maximum length of a task id in our database is 129 characters.
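One way to accommodate that, sketched here as a suggestion rather than the PR's code: PostgreSQL's TEXT type has no length limit and no performance penalty versus VARCHAR, so it fits arbitrary task id strings. The CREATE TABLE text below is illustrative; only the column set mirrors the diffs above:

_create_events = """CREATE TABLE IF NOT EXISTS events
(
    id SERIAL PRIMARY KEY,
    time TIMESTAMP NOT NULL,
    uuid TEXT NOT NULL,      -- task ids can be arbitrary strings, not just UUIDs
    data JSONB NOT NULL
)"""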

Author

@bosonogi Just letting you know I've been slammed with other work lately, so I haven't gotten back to this, but I will try. For our environment it seems like our volume is stressful to PostgreSQL, so I may write up a very simple Redis provider and contribute it to your PR.

Owner

@johnarnold cool; work situation is similar at my end.

An idea for an alternative approach has been brewing that would sidestep Flower (and its in-memory architecture). As I only use a small subset of what Flower does (starting tasks and looking up task details via its HTTP API), it seems it should be easy to make a Django REST Framework app that does this -- using the event data we already have.
