(feature): query dataset to return pandas df #93

Open
wants to merge 2 commits into
base: main


@Rambatino Rambatino commented Feb 13, 2024

Why

pandas offers functionality that is not currently possible in APL. Sometimes it makes sense to pull the data out and run more advanced algorithms (e.g. ML) on the raw data. Having this functionality inside the library benefits everyone and eases adoption of the library, and thus of Axiom as a platform.

Functionality

This is a first attempt at adding pandas DataFrame functionality to axiom. It builds the DataFrame across multiple threads to improve performance when dealing with large amounts of data.
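
A minimal sketch of the threaded approach, not the code in this PR: it splits the time range into windows, fetches each window in a worker thread, and concatenates the per-window frames. `fetch_window`, `split_range`, and `build_df` are hypothetical names, and `fetch_window` only fakes an API call.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timedelta, timezone

import pandas as pd


def split_range(start, end, parts):
    """Split [start, end) into `parts` contiguous, equally sized windows."""
    step = (end - start) / parts
    return [(start + i * step, start + (i + 1) * step) for i in range(parts)]


def fetch_window(window):
    """Hypothetical stand-in for one API query; returns a one-row frame."""
    lo, hi = window
    return pd.DataFrame({"_time": [lo], "window_end": [hi]})


def build_df(start, end, parts=4):
    windows = split_range(start, end, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        # map() yields results in input order, so the frames stay time-ordered.
        frames = list(pool.map(fetch_window, windows))
    # ignore_index=True gives the combined frame one continuous index.
    return pd.concat(frames, ignore_index=True)


end = datetime(2024, 2, 13, 16, 0, tzinfo=timezone.utc)
df = build_df(end - timedelta(minutes=60), end)
```

Threads suit this kind of work because each worker spends most of its time waiting on the network rather than on the CPU.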

This PR also adds the tabular format, which is far cleaner to convert into a DataFrame because it is already a matrix. It also handles nesting.
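
To illustrate why a column-oriented result converts cleanly, here is a rough sketch. The `fields`/`columns` layout below is an assumed, simplified shape with made-up sample values; the real tabular response may differ, and flattening with `pd.json_normalize` is just one way to deal with nested values.

```python
import pandas as pd

# Assumed, simplified shape of a tabular result: a list of field names and a
# matching list of column arrays. Sample values are invented for illustration.
table = {
    "fields": ["_time", "status", "attrs"],
    "columns": [
        ["2024-02-13T15:10:53Z", "2024-02-13T15:10:54Z"],
        ["closed", "open"],
        [{"region": "eu", "retries": 1}, {"region": "us", "retries": 0}],
    ],
}

# Columns map directly onto a DataFrame: field name -> column values.
df = pd.DataFrame(dict(zip(table["fields"], table["columns"])))
df["_time"] = pd.to_datetime(df["_time"], utc=True)

# Expand the nested dict column into flat "attrs.<key>" columns.
nested = pd.json_normalize(df["attrs"].tolist()).add_prefix("attrs.")
df = pd.concat([df.drop(columns="attrs"), nested], axis=1)
```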

Switched most of the camelCase variables to snake_case, in line with Python best practices (this was harder than planned when using dacite).
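
One way to bridge camelCase payloads and snake_case dataclass fields is to rename the keys before constructing the object. This is only a sketch using the standard library, not necessarily what the PR does; `Status`, `to_snake`, and `snake_keys` are hypothetical names.

```python
import re
from dataclasses import dataclass


def to_snake(name):
    """Convert camelCase to snake_case, e.g. 'maxCursor' -> 'max_cursor'."""
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


def snake_keys(obj):
    """Recursively rename dict keys so they match snake_case field names."""
    if isinstance(obj, dict):
        return {to_snake(k): snake_keys(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [snake_keys(v) for v in obj]
    return obj


@dataclass
class Status:  # hypothetical response type, for illustration only
    rows_matched: int
    max_cursor: str


payload = {"rowsMatched": 10, "maxCursor": "abc"}
status = Status(**snake_keys(payload))
```

With dacite, the converted dict could be passed to `dacite.from_dict` instead of calling the constructor directly. Note that this renaming should only be applied to response metadata, since it would also rewrite event field names such as `_sysTime`.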

Limitations

Querying large amounts of data (more than 10000 rows) is very slow. This is a limitation imposed by the API. It can be partly overcome by using Axiom as it was intended: filter or aggregate with APL before querying.
The query must contain "sort by _time asc"; the asc ordering means that maxCursor is used.
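
The role of the ascending sort can be sketched as a paging loop: with rows ordered by `_time`, each request resumes just past the last row of the previous page. Everything below is a simulation under that assumption; `query_page` is a hypothetical stand-in backed by an in-memory list, not a real client method, and its integer cursor only plays the role of maxCursor.

```python
ROWS = [{"_time": t, "value": t * 10} for t in range(25)]  # fake, sorted asc


def query_page(cursor=None, limit=10):
    """Hypothetical stand-in for one API call.

    Returns up to `limit` rows after `cursor`, plus the cursor of the last
    row returned, or None when there are no rows.
    """
    start = 0 if cursor is None else cursor + 1
    page = ROWS[start:start + limit]
    max_cursor = start + len(page) - 1 if page else None
    return page, max_cursor


def fetch_all(limit=10):
    rows, cursor = [], None
    while True:
        page, max_cursor = query_page(cursor, limit)
        rows.extend(page)
        if len(page) < limit:  # a short page means there is nothing left
            return rows
        cursor = max_cursor  # resume just past the last row seen


all_rows = fetch_all()
```

Because every page costs one round trip, the number of requests grows with the row count, which is consistent with large result sets being slow.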

Example

ipdb> client.df('axiom-monitor-results',  datetime.now(UTC) - timedelta(minutes=60), datetime.now(UTC))
                              _sysTime                               _time alert_state brought_forward            check_id disabling  ... result                                run_id              run_time structured threshold trigger_after_consecutive_result
0  2024-02-13 15:10:55.190425983+00:00    2024-02-13 15:10:53.367964+00:00      closed           False  in9GwvOpvpozgc88dV     False  ...  False  2c64a96a-391a-494f-b2ad-bdf797d12fb1  2024-02-13T15:10:55Z      False         1                            False
1  2024-02-13 15:10:55.179746213+00:00    2024-02-13 15:10:54.407795+00:00      closed           False  GRl5S03BbqtJ23ARQz     False  ...  False  f2cca147-5248-48a5-a4b5-b25d0dd25e3f  2024-02-13T15:10:55Z      False         1                            False
2  2024-02-13 15:11:00.780976092+00:00 2024-02-13 15:11:00.319561522+00:00                       False                         False  ...  False  28929d4c-3880-42ed-9a86-6d5b8bae3aa5  2024-02-13T15:11:00Z      False       0.5                            False
3  2024-02-13 15:11:00.981409815+00:00 2024-02-13 15:11:00.319561542+00:00                       False                         False  ...  False  a0b9011e-10f3-49af-9fc7-d124b43e699b  2024-02-13T15:11:00Z      False      1000                            False
4  2024-02-13 15:11:00.851740355+00:00 2024-02-13 15:11:00.319567942+00:00                       False                         False  ...  False  d2a79456-e39d-4bd9-a0fe-6ec97603761a  2024-02-13T15:11:00Z      False      1000                            False
..                                 ...                                 ...         ...             ...                 ...       ...  ...    ...                                   ...                   ...        ...       ...                              ...
49 2024-02-13 15:49:10.188574630+00:00    2024-02-13 15:49:09.708098+00:00      closed           False  YoFXhZF0zNhjvsWdzL     False  ...  False  49fd2f6b-22f5-4d6b-838e-6cea37ca862c  2024-02-13T15:49:10Z      False     -1000                            False
50 2024-02-13 15:49:15.208177122+00:00    2024-02-13 15:49:10.553381+00:00      closed           False  oRzB2jXAK4E86OrVfG     False  ...  False  512a2d58-2cbe-4bf4-aa1a-d3372d48fb9a  2024-02-13T15:49:15Z      False      1000                            False
51 2024-02-13 15:49:15.203465039+00:00    2024-02-13 15:49:10.822038+00:00      closed           False  zBY8T8Mrd9QeWde7e4     False  ...  False  2620cb31-8fbf-4728-a8a3-614a7967f8c3  2024-02-13T15:49:15Z      False       0.5                            False
52 2024-02-13 15:49:15.177608139+00:00    2024-02-13 15:49:11.265719+00:00      closed           False  lCusSaHrH4w6W7ri4z     False  ...  False  9c91df64-3ac1-4f18-a766-453f50c52afa  2024-02-13T15:49:15Z      False       0.5                            False
53 2024-02-13 15:49:15.186986925+00:00    2024-02-13 15:49:11.667018+00:00      closed           False  7YJaqIHyBf3fufGq9M     False  ...  False  7f3d1bfe-41a0-4c13-ada3-c5be3a0cb1ae  2024-02-13T15:49:15Z      False      1000                            False

[3268 rows x 31 columns]
ipdb> client.df('axiom-monitor-results',  datetime.now(UTC) - timedelta(minutes=100), datetime.now(UTC), "where monitor_id == 'kLcWcc3t63W3peaFwH' | sort by _time asc")
                              _sysTime                               _time alert_state brought_forward            check_id disabling  ... result                                run_id              run_time structured threshold trigger_after_consecutive_result
0  2024-02-13 14:14:56.437307356+00:00    2024-02-13 14:14:53.367964+00:00      closed           False  in9GwvOpvpozgc88dV     False  ...  False  30e99f87-6d4f-4767-91cb-ad50b2a5cae2  2024-02-13T14:14:56Z      False         1                            False
0  2024-02-13 14:28:56.583982944+00:00    2024-02-13 14:28:53.367964+00:00      closed           False  in9GwvOpvpozgc88dV     False  ...  False  4b66d6d3-ea14-46e0-9d42-7eda5111836d  2024-02-13T14:28:56Z      False         1                            False
1  2024-02-13 14:29:11.546863642+00:00 2024-02-13 14:29:11.515838115+00:00                       False                         False  ...  False  408064ab-0361-4bc7-99ad-66450d83cf13  2024-02-13T14:29:11Z       True         1                            False
0  2024-02-13 14:17:56.516061247+00:00    2024-02-13 14:17:53.367964+00:00      closed           False  in9GwvOpvpozgc88dV     False  ...  False  cfabf6b2-c389-4b33-b12c-d7530353c4f7  2024-02-13T14:17:56Z      False         1                            False
0  2024-02-13 14:24:56.437592498+00:00    2024-02-13 14:24:53.367964+00:00      closed           False  in9GwvOpvpozgc88dV     False  ...  False  f9e1a83e-5ac8-4b6e-a97b-273329db0779  2024-02-13T14:24:56Z      False         1                            False
..                                 ...                                 ...         ...             ...                 ...       ...  ...    ...                                   ...                   ...        ...       ...                              ...
0  2024-02-13 15:46:55.242848869+00:00    2024-02-13 15:46:53.367964+00:00      closed           False  in9GwvOpvpozgc88dV     False  ...  False  c2a47ec1-2898-4fc4-959a-5f756f3eab29  2024-02-13T15:46:55Z      False         1                            False
0  2024-02-13 15:45:55.181259160+00:00    2024-02-13 15:45:53.367964+00:00      closed           False  in9GwvOpvpozgc88dV     False  ...  False  c1c30649-05bc-4efc-a5a0-525c275c5203  2024-02-13T15:45:55Z      False         1                            False
1  2024-02-13 15:46:00.404366699+00:00 2024-02-13 15:46:00.293446703+00:00                       False                         False  ...  False  2cdf6595-6213-4e92-bf17-b939b68833d2  2024-02-13T15:46:00Z       True         1                            False
0  2024-02-13 15:43:55.228986218+00:00    2024-02-13 15:43:53.367964+00:00      closed           False  in9GwvOpvpozgc88dV     False  ...  False  dc757d24-00d8-4e11-80a6-93a759b44ba8  2024-02-13T15:43:55Z      False         1                            False
1  2024-02-13 15:44:00.450780051+00:00 2024-02-13 15:44:00.326724599+00:00                       False                         False  ...  False  4a24e8a7-3a3c-41e2-a1f2-96620cbe9d5f  2024-02-13T15:44:00Z       True         1                            False

[165 rows x 31 columns]

@Rambatino Rambatino force-pushed the to_pandas branch 3 times, most recently from d0100b9 to 02cd303 Compare February 13, 2024 16:17
Contributor

@schehata schehata left a comment

Super noice, if you can add the df() to the README file as well, it would be perfect.
