# QDL Tutorial

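This notebook demonstrates loading factor data, reshaping it to wide format, and selecting specific columns through the facade. We will:
- Import modules
- Load USA value-weighted (VW) factors (dataset="factor") and project columns ["date","name","ret"]
- Preview the raw data and its wide form
- Repeat the load through the `QDL` facade, passing the column selection as an argument
- Run a simple validation that aligns two series by index and compares their column sets
- Load characteristics (JKP) with strict column selection ["permno","me","be_me"]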


In [1]:
from qdl import dataloader, transformer

# Load USA (vw) factors from the long-form factor dataset and project required columns
df = dataloader.load_factors(country="usa", dataset="factor", weighting="vw")[
    ["date", "name", "ret"]
]
print("Raw factors head (projected):")
print(df.head(5))


Raw factors head (projected):
        date name       ret
0 1926-03-31  age -0.098837
1 1926-04-30  age -0.006303
2 1926-05-31  age -0.000190
3 1926-06-30  age -0.008323
4 1926-07-31  age -0.004220


In [2]:
# Pivot to wide format (date index, factor names as columns)
wide = transformer.to_wide_factors(df)
print("\nWide factors head:")
print(wide.head(5))



Wide factors head:
                 age  aliq_at  aliq_mat  ami_126d  at_be  at_gr1  at_me  \
date                                                                      
1926-01-31       NaN      NaN       NaN       NaN    NaN     NaN    NaN   
1926-02-28       NaN      NaN       NaN       NaN    NaN     NaN    NaN   
1926-03-31 -0.098837      NaN       NaN       NaN    NaN     NaN    NaN   
1926-04-30 -0.006303      NaN       NaN -0.003724    NaN     NaN    NaN   
1926-05-31 -0.000190      NaN       NaN -0.005195    NaN     NaN    NaN   

            at_turnover  be_gr1a  be_me  ...  taccruals_at  taccruals_ni  \
date                                     ...                               
1926-01-31          NaN      NaN    NaN  ...           NaN           NaN   
1926-02-28          NaN      NaN    NaN  ...           NaN           NaN   
1926-03-31          NaN      NaN    NaN  ...           NaN           NaN   
1926-04-30          NaN      NaN    NaN  ...           NaN           NaN  
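
The reshape above can be pictured as a plain pandas pivot. The following is a minimal sketch, assuming `to_wide_factors` behaves like `DataFrame.pivot` on the `date`, `name`, and `ret` columns; the actual implementation in `qdl.transformer` may differ. The name `wide_sketch` is illustrative, and the three rows are copied from the outputs above.

```python
import pandas as pd

# Three long-form rows taken from the printed outputs above
long_df = pd.DataFrame(
    {
        "date": pd.to_datetime(["1926-03-31", "1926-04-30", "1926-04-30"]),
        "name": ["age", "age", "ami_126d"],
        "ret": [-0.098837, -0.006303, -0.003724],
    }
)

# Assumption: one row per (date, name) pair; pivot raises ValueError on duplicates
wide_sketch = long_df.pivot(index="date", columns="name", values="ret").sort_index()
print(wide_sketch)
```

Factors with no observation on a given date come out as NaN, which matches the leading NaN rows in the wide output above.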

In [3]:
# Using the Facade interface with column selection
from qdl.facade import QDL

q = QDL()
df2 = q.load_factors(
    country="usa",
    dataset="factor",
    weighting="vw",
    columns=["date", "name", "ret"],
)
wide2 = transformer.to_wide_factors(df2)
print("\nFacade wide factors head:")
print(wide2.head(5))



Facade wide factors head:
                 age  aliq_at  aliq_mat  ami_126d  at_be  at_gr1  at_me  \
date                                                                      
1926-01-31       NaN      NaN       NaN       NaN    NaN     NaN    NaN   
1926-02-28       NaN      NaN       NaN       NaN    NaN     NaN    NaN   
1926-03-31 -0.098837      NaN       NaN       NaN    NaN     NaN    NaN   
1926-04-30 -0.006303      NaN       NaN -0.003724    NaN     NaN    NaN   
1926-05-31 -0.000190      NaN       NaN -0.005195    NaN     NaN    NaN   

            at_turnover  be_gr1a  be_me  ...  taccruals_at  taccruals_ni  \
date                                     ...                               
1926-01-31          NaN      NaN    NaN  ...           NaN           NaN   
1926-02-28          NaN      NaN    NaN  ...           NaN           NaN   
1926-03-31          NaN      NaN    NaN  ...           NaN           NaN   
1926-04-30          NaN      NaN    NaN  ...           NaN           NaN   

In [4]:
# Simple validation: align by index and compare column sets (like test_qdl.py)
# Simulate differing periods between user and reference series:
# the user series drops the first 12 rows, the reference series drops the last 6
user_wide = wide2.iloc[12:].copy()
ref_wide = wide2.iloc[:-6].copy()

common_idx = user_wide.index.intersection(ref_wide.index)
user_aligned = user_wide.loc[common_idx, :]
ref_aligned = ref_wide.loc[common_idx, :]

print("\nValidation (index inner-join) period:", common_idx.min(), "to", common_idx.max())
print("Aligned length:", len(common_idx))
print("Columns equal (set):", set(user_aligned.columns) == set(ref_aligned.columns))




Validation (index inner-join) period: 1927-01-31 00:00:00 to 2024-06-30 00:00:00
Aligned length: 1170
Columns equal (set): True
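
The check above compares only the column sets. A value-level comparison on the aligned window could be sketched as follows. This is an illustration on synthetic data, not part of QDL; the frame `full` and the names `user`, `ref`, and `values_match` are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic monthly panel (made-up values, for illustration only)
dates = pd.to_datetime(
    ["2000-01-31", "2000-02-29", "2000-03-31", "2000-04-30", "2000-05-31", "2000-06-30"]
)
full = pd.DataFrame(
    {
        "age": [0.01, 0.02, np.nan, 0.04, 0.05, 0.06],
        "be_me": [0.10, np.nan, 0.30, 0.40, 0.50, 0.60],
    },
    index=dates,
)

# Simulate differing periods, as in the cell above
user = full.iloc[2:].copy()   # drops the first two rows
ref = full.iloc[:-1].copy()   # drops the last row

# Inner-join on both the index and the columns
common_idx = user.index.intersection(ref.index)
common_cols = user.columns.intersection(ref.columns)
u = user.loc[common_idx, common_cols]
r = ref.loc[common_idx, common_cols]

# equal_nan=True treats NaN in the same position on both sides as a match
values_match = np.allclose(u.to_numpy(), r.to_numpy(), equal_nan=True)
print("Aligned length:", len(common_idx))
print("Values match:", values_match)
```

Restricting both frames to `common_cols` also fixes the column order, so the element-wise comparison does not depend on how each side happened to order its factors.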


In [5]:
# Load characteristics (JKP) with strict column selection
# Note: the printed frame also contains `date` and `id` alongside the requested columns
from qdl.facade import QDL

q = QDL()
chars = q.load_chars(
    country="usa",
    vintage="2020-",
    columns=["permno", "me", "be_me"],
)
print("\nChars head (strict columns):")
print(chars.head(5))



Chars head (strict columns):
    permno          me  be_me       date       id
0  20964.0  371.249986    NaN 2021-04-30  20964.0
1  20964.0  338.100007    NaN 2021-05-28  20964.0
2  20964.0  336.375000    NaN 2021-06-30  20964.0
3  20964.0  337.409991    NaN 2021-07-30  20964.0
4  20964.0  336.375000    NaN 2021-08-31  20964.0
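
The output above carries `date` and `id` in addition to the three requested columns, which suggests that key columns are retained regardless of the selection. One way such a strict projection could behave is sketched below. This is an assumption about the semantics, not QDL's documented behavior: the helper `select_strict`, its `keys` parameter, and the `extra` column are hypothetical.

```python
import pandas as pd


def select_strict(df, columns, keys=("date", "id")):
    """Return the requested columns plus any key columns; fail on missing ones.

    Hypothetical helper for illustration; not part of the qdl package.
    """
    missing = [c for c in columns if c not in df.columns]
    if missing:
        # "Strict" here means a missing column is an error, not a silent skip
        raise KeyError(f"Missing required columns: {missing}")
    extra_keys = [k for k in keys if k in df.columns and k not in columns]
    return df[list(columns) + extra_keys]


# One row modeled on the output above, plus an unrelated column to be dropped
toy = pd.DataFrame(
    {
        "permno": [20964.0],
        "me": [371.249986],
        "be_me": [float("nan")],
        "date": pd.to_datetime(["2021-04-30"]),
        "id": [20964.0],
        "extra": [0],
    }
)

out = select_strict(toy, ["permno", "me", "be_me"])
print(out.columns.tolist())
```

Raising on a missing column surfaces typos in the column list early, instead of returning a frame that silently lacks a field.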
