# Centralizing Data with Python Classes  

This notebook demonstrates how to use Python classes to centralize and manage data efficiently. These classes are useful for:  

- **Validating** input tables.  
- **Building pipelines** for new data flows.  
- **Accessing data** from sources outside official environments (e.g., Data Lakes).  
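As an illustration of the validation use case, here is a minimal sketch using pandas. The function name `validate_table`, its parameters, and the sample `orders` table are hypothetical and not part of the handler classes; they only show the kind of check that can run before a table enters a pipeline.

```python
import pandas as pd


def validate_table(df: pd.DataFrame, required_columns: list[str], key: str) -> list[str]:
    """Return a list of problems found; an empty list means the table passed."""
    problems = []

    # Every required column must be present.
    missing = [c for c in required_columns if c not in df.columns]
    if missing:
        problems.append(f"missing columns: {missing}")

    # The key column, if present, must be non-null and unique.
    if key in df.columns:
        if df[key].isna().any():
            problems.append(f"null values in key column '{key}'")
        if df[key].duplicated().any():
            problems.append(f"duplicate values in key column '{key}'")

    return problems


# Hypothetical input table with a repeated key and a missing column.
orders = pd.DataFrame({"order_id": [1, 2, 2], "amount": [10.0, 5.5, 7.0]})
issues = validate_table(
    orders, required_columns=["order_id", "amount", "currency"], key="order_id"
)
print(issues)
```

Returning the problems as a list, rather than raising on the first one, lets a pipeline report every issue in a single pass.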

You can adapt these classes, or create your own, to integrate with systems like Data Lakes for ad-hoc analysis.  
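If you write your own class, one possible shape is sketched below. `CsvHandler` and its `get()` method are hypothetical names chosen for illustration; the actual interface in the handlers file is not shown in this notebook, so treat this only as an assumption about what "one class per source, returning a DataFrame" could look like.

```python
import io

import pandas as pd


class CsvHandler:
    """Hypothetical handler that wraps one CSV source behind a small interface."""

    def __init__(self, source):
        # `source` is a path or a file-like object accepted by pandas.read_csv.
        # Note: a file-like object can only be read once.
        self.source = source

    def get(self, columns=None) -> pd.DataFrame:
        """Read the source and return a DataFrame, optionally limited to `columns`."""
        return pd.read_csv(self.source, usecols=columns)


# In-memory stand-in for a real file, so the sketch runs without external data.
raw = io.StringIO("id,name,price\n1,Camry,28000\n2,Civic,22000\n")
handler = CsvHandler(raw)
df = handler.get(columns=["id", "price"])
print(df.shape)
```

Keeping connection details in the constructor and data access in a single method means the calling code stays the same when the underlying source changes.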

---

## Data Return Formats  
✔ **All classes except `MarketingCloud` return pandas DataFrames.**  
✔ **`MarketingCloud.get_data_extension_by_external_key()`** saves a `.parquet` file to the location given by the `file_path` parameter (e.g., `file_path='.\path'`). Load it with:  
```python
import pandas as pd

# Raw string so the backslashes in the Windows-style path are not read as escapes
df = pd.read_parquet(r".\path\your_path.parquet")
```

# Import handlers and requirements

In [None]:
# Update login_params with your credentials before running this cell.
# The handler classes are defined in Handlers.py.
from Handlers import *

# Use the classes

## Google Sheet

In [19]:
gsheet_id = 'xxx'  # or the file link; share the file/folder with the client_email from your Google credentials JSON
gs = GSheet(
    gsheet_id=gsheet_id,  # for multiple files, instantiate separate objects (gs1, gs2, ...)
    credentials='credentials.json'  # https://developers.google.com/workspace/guides/create-credentials
)

# Example data:
cars_df = pd.DataFrame({
    'make': ['Toyota', 'Honda', 'Ford', 'Tesla', 'BMW', 'Audi', 'Mercedes', 'Nissan', 'Chevrolet', 'Hyundai'],
    'model': ['Camry', 'Civic', 'F-150', 'Model 3', 'X5', 'A4', 'C-Class', 'Altima', 'Silverado', 'Tucson'],
    'year': [2022, 2021, 2023, 2023, 2022, 2021, 2023, 2022, 2023, 2021],
    'price': [28000, 22000, 35000, 48000, 62000, 42000, 45000, 26000, 38000, 27000]
})

sheet_name = 'Sheet1'
gs.load(data=cars_df, sheet_name=sheet_name)  # loads the DataFrame into the sheet
df_teste = gs.get(sheet_name=sheet_name)  # reads the sheet data back
gs.clear(sheet_name=sheet_name)  # erases the data from the sheet

df_teste.head(3)

Unnamed: 0,make,model,year,price
0,Toyota,Camry,2022,28000
1,Honda,Civic,2021,22000
2,Ford,F-150,2023,35000


## Hotmart

In [None]:
# Load transactions where dt_start <= x <= dt_end
dt_start, dt_end = '2025-02-15 03:00:00', '2025-02-16 03:00:00'
formato = '%Y-%m-%d %H:%M:%S'  # strftime-style format of the date strings above (as accepted by pandas)

hm.get_sales_hm(dt_start, dt_end, formato)

## Marketing Cloud Handler

In [None]:
# https://www.youtube.com/watch?v=qxJolioZr3M
mc.get_data_extension_by_external_key(
    external_key='xxxx', # External Key of the Data Extension
    max_page=5 # If None, returns all rows!
)

## Redshift Handler

In [None]:
rs.get_query("""
SELECT 
    * 
FROM 
    schema.table
""")

## Salesforce Handler

In [None]:
# https://developer.salesforce.com/docs/atlas.en-us.soql_sosl.meta/soql_sosl/sforce_api_calls_soql.htm
sf.get_query("""
    SELECT 
        FIELDS(ALL) 
    FROM 
        Account
    LIMIT 1
""")