A Python module that takes in data from Google Analytics and joins it with the appropriate MongoDB SecurEd collections as Polars dataframes for easier data analysis.
The dataframe can be converted to different data types and file formats such as csv, Pandas dataframe, Apache Arrow, etc.
The version of PyMongo used throughout this module leverages the PyMongoArrow extension to output Polars dataframes directly.
PyMongoArrow is a PyMongo extension containing tools for loading MongoDB query result sets as Apache Arrow tables, Pandas dataframes, and NumPy arrays.
A Google Analytics account, which supplies the following environment variables to be written in a .env file:
GOOGLE_SERVICE_ACCOUNT_EMAIL
GOOGLE_PRIVATE_KEY
A MongoDB database URI:
MONGO_DB_URI
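Before running the pipeline, it can help to confirm that these variables are actually set. The following is a minimal sketch using only the standard library; `REQUIRED_VARS` and `missing_env_vars` are hypothetical names for this example and are not part of the package. It assumes the .env file has already been loaded into the process environment, by whatever mechanism you use.

```python
import os

# Variable names taken from the requirements listed above.
REQUIRED_VARS = (
    "GOOGLE_SERVICE_ACCOUNT_EMAIL",
    "GOOGLE_PRIVATE_KEY",
    "MONGO_DB_URI",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.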
This package is not currently on PyPI, so install it from Git with either uv or pip:
uv add git+https://github.com/Cyber4All/secured_python_pipeline.git
pip install git+https://github.com/Cyber4All/secured_python_pipeline.git