# Using Tulips

This tutorial covers the core data structures of the Tulip library: **TulipSeries** and **TulipCollection**. These classes provide a powerful framework for handling financial and economic time series data with rich metadata and analytical capabilities.

## Table of Contents
1. [TulipSeries: Enhanced Time Series](#tulipseries)
2. [TulipCollection: Multi-Series Container](#tulipcollection)
3. [Creating Collections from Data Sources](#data-sources)
4. [Dashboard Generation](#dashboards)
5. [Data Persistence](#persistence)
6. [Advanced Usage](#advanced)

---

## TulipSeries: Enhanced Time Series {#tulipseries}

**TulipSeries** is a thin wrapper around `pandas.Series` that adds domain-specific metadata and helper methods for financial/economic analysis.

### Core Features

- **Time-indexed data**: Built on a `pandas.Series`, which the examples in this tutorial access through the `.ts` attribute
- **Rich metadata**: Stores frequency, source information, units, dates
- **Domain-specific methods**: `summary()`, `plot()`, and analytical helpers
- **Automatic metadata tracking**: Last updated timestamps, data provenance

### Key Attributes

```python
# Core data structure
series.ts            # The underlying pandas Series
series.id            # Unique identifier (e.g., "UNRATE")
series.info.title    # Human-readable name (e.g., "Unemployment Rate")
series.last_updated  # When data was last refreshed
series.info          # Metadata container (title, units, dates, source, etc.)
```

### Essential Methods

- **`summary()`**: Statistical overview with latest values, changes, Z-scores
- **`plot()`**: Quick visualization using internal plotting functions
- **Standard pandas operations**: Available on the underlying Series via `.ts` (for example `series.ts.tail(3)` or `series.ts.rolling(6).mean()`)
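
As a rough illustration of the wrapper idea described in this section, the sketch below pairs a `pandas.Series` with a small metadata object. The names `MiniSeries` and `MiniInfo` are hypothetical and are not part of Tulip; the real `TulipSeries` may be implemented quite differently. The three CPI values are the ones printed later in this tutorial.

```python
from dataclasses import dataclass, field
from datetime import datetime

import pandas as pd


@dataclass
class MiniInfo:
    """Hypothetical metadata container (not Tulip's actual class)."""
    title: str = ""
    units: str = ""


@dataclass
class MiniSeries:
    """Hypothetical thin wrapper: a pandas Series plus metadata."""
    id: str
    ts: pd.Series
    info: MiniInfo = field(default_factory=MiniInfo)
    last_updated: datetime = field(default_factory=datetime.now)

    def latest(self) -> float:
        """Return the most recent observation."""
        return float(self.ts.iloc[-1])


# Three month-end CPI readings, oldest first
index = pd.to_datetime(["2025-07-31", "2025-08-31", "2025-09-30"])
cpi = MiniSeries(
    id="CPIAUCSL",
    ts=pd.Series([322.132, 323.364, 324.368], index=index, name="CPIAUCSL"),
    info=MiniInfo(title="Consumer Price Index"),
)

print(cpi.id, cpi.info.title, cpi.latest())
print(cpi.ts.diff().iloc[-1])  # pandas methods stay available through .ts
```

Keeping the Series in an attribute rather than subclassing `pandas.Series` is one possible design; it keeps the metadata from being lost when pandas operations return new objects.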

---

## TulipCollection: Multi-Series Container {#tulipcollection}

**TulipCollection** bundles multiple TulipSeries objects for batch operations and comparative analysis.

### Primary Use Cases

1. **Dashboard generation**: Create summary tables and plots for related indicators
2. **Batch analysis**: Apply operations across multiple time series
3. **Data pipelines**: Feed collections into analytical workflows
4. **Comparative studies**: Analyze relationships between economic indicators

### Key Methods

- **`dashboard.table()`**: Formatted summary table for all series
- **`dashboard.plots()`**: Multi-series visualizations with customization
- **`save()`** / **`load()`**: Persistence to/from pickle files
- **Index access**: `collection[i]` to access individual series
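
The examples in this tutorial rely on `len(collection)`, `collection[i]`, and iteration over a collection. A minimal sketch of a container that supports those three operations is shown below. The names `MiniCollection` and `Item` are hypothetical and are not Tulip's own classes.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Item:
    """Stand-in for a series object; only the id matters here."""
    id: str


class MiniCollection:
    """Hypothetical container implementing the sequence protocol."""

    def __init__(self, items: List[Item]) -> None:
        self._items = list(items)

    def __len__(self) -> int:
        return len(self._items)

    def __getitem__(self, i: int) -> Item:
        return self._items[i]

    def __iter__(self) -> Iterator[Item]:
        return iter(self._items)


collection = MiniCollection([Item("BOGMBASE"), Item("BOGMBBM"), Item("CPIAUCSL")])
print(len(collection))             # number of series
print(collection[0].id)            # positional access
print([s.id for s in collection])  # iteration
```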

In [1]:
# Import necessary modules
from tulip.core.collection import TulipCollection
from tulip.data.fred import FredClient

# Display all outputs in cells
from IPython.core.interactiveshell import InteractiveShell

InteractiveShell.ast_node_interactivity = "all"

In [2]:
# Create a sample collection using FRED data
# BOGMBASE = Monetary Base: Total, BOGMBBM = Monetary Base: Reserve Balances,
# CPIAUCSL = Consumer Price Index for All Urban Consumers
sample_collection = FredClient.create_collection(
    codes=["BOGMBASE", "BOGMBBM", "CPIAUCSL"]
)

print("Collection created successfully!")
print(f"Number of series in collection: {len(sample_collection)}")
print(f"Collection type: {type(sample_collection)}")

# Examine the first series
first_series = sample_collection[0]
print(f"\nFirst series details:")
print(f"ID: {first_series.id}")
print(f"Title: {first_series.info.title}")
print(f"Last updated: {first_series.last_updated}")
print(f"Data points: {len(first_series.ts)}")
print(f"Date range: {first_series.ts.index.min()} to {first_series.ts.index.max()}")

Collection created successfully!
Number of series in collection: 3
Collection type: <class 'tulip.core.collection.TulipCollection'>

First series details:
ID: BOGMBASE
Title: Monetary Base: Total
Last updated: 2025-11-12 19:46:19.957648
Data points: 801
Date range: 1959-01-31 00:00:00 to 2025-09-30 00:00:00


## Creating Collections from Data Sources {#data-sources}

TulipCollections are typically created through data client classes that automatically handle data retrieval and TulipSeries creation.

### FRED Data Client

The Federal Reserve Economic Data (FRED) client provides access to thousands of economic time series.

In [3]:
sample_collection.dashboard.table()

| | Last Value | Last Date | Previous Value | Change Since Previous | Change Since Previous Z | Change 6M | Change 12M | Updated |
|---|---|---|---|---|---|---|---|---|
| Monetary Base: Total | 5478.0 | 2025-09-30 | 5686.0 | -208.1 | -3.4 | -135.3 | -109.8 | 2025-11-12 19:46:00 |
| Monetary Base: Reserve Balances | 3068.0 | 2025-09-30 | 3282.0 | -213.8 | -3.6 | -193.1 | -168.7 | 2025-11-12 19:46:00 |
| Consumer Price Index for All Urban Consumers: All Items in U.S. City Average | 324.4 | 2025-09-30 | 323.4 | 1.0 | 2.1 | 4.6 | 9.5 | 2025-11-12 19:46:00 |


## Dashboard Generation {#dashboards}

A TulipCollection exposes a `dashboard` accessor that builds summary tables and plots for all the series it contains.

### Dashboard Table

The `dashboard.table()` method creates a formatted summary showing key statistics for all series in the collection.

In [4]:
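# Customize series titles for better display
# Note: FRED describes BOGMBBM as "Monetary Base: Reserve Balances", which is
# not the M1 money supply; adjust the second label if accuracy matters.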
sample_collection[0].info.title = "Monetary Base (Billions $)"
sample_collection[1].info.title = "Money Supply M1 (Billions $)"
sample_collection[2].info.title = "Consumer Price Index"

# Generate dashboard table
print("Dashboard Table for Economic Indicators:")
dashboard_table = sample_collection.dashboard.table()
dashboard_table

Dashboard Table for Economic Indicators:


| | Last Value | Last Date | Previous Value | Change Since Previous | Change Since Previous Z | Change 6M | Change 12M | Updated |
|---|---|---|---|---|---|---|---|---|
| Monetary Base (Billions $) | 5478.0 | 2025-09-30 | 5686.0 | -208.1 | -3.4 | -135.3 | -109.8 | 2025-11-12 19:46:00 |
| Money Supply M1 (Billions $) | 3068.0 | 2025-09-30 | 3282.0 | -213.8 | -3.6 | -193.1 | -168.7 | 2025-11-12 19:46:00 |
| Consumer Price Index | 324.4 | 2025-09-30 | 323.4 | 1.0 | 2.1 | 4.6 | 9.5 | 2025-11-12 19:46:00 |
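
The table's column names suggest how each figure is derived. The sketch below reproduces those columns for a short list of made-up values using only the standard library. The helper name `row_stats` is hypothetical, and the Z-score definition (latest change divided by the standard deviation of all period-over-period changes) is an assumption, since this tutorial does not state how Tulip computes it.

```python
from statistics import stdev
from typing import Dict, List


def row_stats(values: List[float], lag_6m: int = 6, lag_12m: int = 12) -> Dict[str, float]:
    """Summarise a monthly series given oldest-to-newest values (hypothetical helper)."""
    if len(values) <= lag_12m:
        raise ValueError("need more than 12 observations")
    # Period-over-period differences, oldest first
    changes = [b - a for a, b in zip(values, values[1:])]
    last_change = changes[-1]
    return {
        "Last Value": values[-1],
        "Previous Value": values[-2],
        "Change Since Previous": last_change,
        # Assumed definition: change scaled by the volatility of past changes
        "Change Since Previous Z": last_change / stdev(changes),
        "Change 6M": values[-1] - values[-1 - lag_6m],
        "Change 12M": values[-1] - values[-1 - lag_12m],
    }


# Made-up monthly index levels, oldest first (not real data)
levels = [100.0, 100.5, 101.0, 101.2, 101.9, 102.3, 102.8,
          103.0, 103.6, 104.1, 104.3, 105.0, 105.4, 106.4]
stats = row_stats(levels)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

A large absolute Z value, under this assumed definition, flags a latest move that is unusual relative to the series' own history.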


### Working with Individual TulipSeries

Let's explore the capabilities of individual TulipSeries objects within our collection.

In [5]:
# Access individual series from the collection
monetary_base = sample_collection[0]  # BOGMBASE
money_supply = sample_collection[1]  # BOGMBBM
cpi = sample_collection[2]  # CPIAUCSL

print("=== TulipSeries Attributes ===")
print(f"Monetary Base ID: {monetary_base.id}")
print(f"Monetary Base Title: {monetary_base.info.title}")
print(f"Units: {getattr(monetary_base.info, 'units', 'Not specified')}")
print(f"Frequency: {getattr(monetary_base.info, 'frequency', 'Not specified')}")

print(f"\n=== Latest Data Points ===")
print(f"Monetary Base (latest): {monetary_base.ts.iloc[-1]:,.0f}")
print(f"Money Supply M1 (latest): {money_supply.ts.iloc[-1]:,.0f}")
print(f"CPI (latest): {cpi.ts.iloc[-1]:.1f}")

print(f"\n=== Data Access (pandas compatibility) ===")
print("Last 3 CPI values:")
print(cpi.ts.tail(3))

=== TulipSeries Attributes ===
Monetary Base ID: BOGMBASE
Monetary Base Title: Monetary Base (Billions $)
Units: Not specified
Frequency: Not specified

=== Latest Data Points ===
Monetary Base (latest): 5,478
Money Supply M1 (latest): 3,068
CPI (latest): 324.4

=== Data Access (pandas compatibility) ===
Last 3 CPI values:
date
2025-07-31    322.132
2025-08-31    323.364
2025-09-30    324.368
Freq: ME, Name: CPIAUCSL, dtype: float64


### TulipSeries Summary Method

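`summary` provides key statistics and insights for a time series. In the run recorded below, calling `cpi.summary()` raised `'Series' object is not callable`, which suggests that in this version `summary` is exposed as an attribute returning a pandas Series rather than as a method. The cell therefore falls back to computing the statistics manually.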

In [6]:
# Demonstrate the summary() method
print("=== CPI Summary ===")
try:
    cpi_summary = cpi.summary()
    print(cpi_summary)
except Exception as e:
    print(f"Summary method error: {e}")
    print("Computing manual summary statistics:")

    # Manual summary statistics
    latest_value = cpi.ts.iloc[-1]
    prev_month = cpi.ts.iloc[-2] if len(cpi.ts) > 1 else latest_value
    prev_year = cpi.ts.iloc[-13] if len(cpi.ts) > 12 else latest_value

    monthly_change = ((latest_value / prev_month) - 1) * 100
    annual_change = ((latest_value / prev_year) - 1) * 100

    print(f"Latest Value: {latest_value:.2f}")
    print(f"Monthly Change: {monthly_change:.2f}%")
    print(f"Annual Change: {annual_change:.2f}%")
    print(f"Latest Date: {cpi.ts.index[-1]}")

=== CPI Summary ===
Summary method error: 'Series' object is not callable
Computing manual summary statistics:
Latest Value: 324.37
Monthly Change: 0.31%
Annual Change: 3.02%
Latest Date: 2025-09-30 00:00:00
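
The manual fallback above picks out individual observations by position. As a sketch, assuming the data is a monthly `pandas.Series` with no gaps, `Series.pct_change` gives the same month-over-month and year-over-year changes for every date at once. The series below is synthetic.

```python
import pandas as pd

# Synthetic monthly index growing 0.25% per month (not real CPI data)
index = pd.date_range("2024-01-01", periods=24, freq="MS")
levels = pd.Series([100 * 1.0025 ** i for i in range(24)], index=index)

# fill_method=None: compare raw values, without forward-filling gaps first
monthly_pct = levels.pct_change(1, fill_method=None) * 100   # vs. previous month
annual_pct = levels.pct_change(12, fill_method=None) * 100   # vs. 12 months earlier

print(f"Monthly change: {monthly_pct.iloc[-1]:.2f}%")
print(f"Annual change: {annual_pct.iloc[-1]:.2f}%")

# The positional calculation gives the same annual figure
manual_annual = (levels.iloc[-1] / levels.iloc[-13] - 1) * 100
print(f"Manual annual change: {manual_annual:.2f}%")
```

The first 12 entries of `annual_pct` are `NaN` because no observation exists 12 months before them.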


### Dashboard Plots

The `dashboard.plots()` method creates multi-series visualizations with various customization options.

In [8]:
# Create plots with customization options
print("Generating dashboard plots...")

# Basic plots
try:
    plots = sample_collection.dashboard.plots()
    plots
except Exception as e:
    print(f"Error generating plots: {e}")
    print("This may be due to environment limitations or missing plot configuration.")

Generating dashboard plots...


## Data Persistence {#persistence}

TulipCollections can be saved and loaded using pickle format for easy data persistence and sharing.

You can save a collection to a file for later use:

In [7]:
sample_collection.save("sample_collection.pickle")


In [9]:
load_back = TulipCollection.load("sample_collection.pickle")

In [10]:
# Verify the loaded collection
print("=== Loaded Collection Verification ===")
print(f"Original collection length: {len(sample_collection)}")
print(f"Loaded collection length: {len(load_back)}")
print(f"Data integrity check: {load_back[0].id == sample_collection[0].id}")

# Compare a few data points
original_latest = sample_collection[0].ts.iloc[-1]
loaded_latest = load_back[0].ts.iloc[-1]
print(f"Data values match: {original_latest == loaded_latest}")

print(f"\nLoaded collection series IDs:")
for i, series in enumerate(load_back):
    print(f"  {i}: {series.id} - {series.info.title}")

=== Loaded Collection Verification ===
Original collection length: 3
Loaded collection length: 3
Data integrity check: True
Data values match: True

Loaded collection series IDs:
  0: BOGMBASE - Monetary Base (Billions $)
  1: BOGMBBM - Money Supply M1 (Billions $)
  2: CPIAUCSL - Consumer Price Index
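
`save()` and `load()` are described above as pickle-based. How Tulip implements them is not shown here, so the following is only a sketch of a pickle round trip using the standard library, with plain dictionaries standing in for series objects. The helper names `save_pickle` and `load_pickle` are hypothetical. As with any pickle file, only load files from sources you trust, because unpickling can execute arbitrary code.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any


def save_pickle(obj: Any, path: Path) -> None:
    """Serialise obj to path with pickle."""
    with path.open("wb") as fh:
        pickle.dump(obj, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_pickle(path: Path) -> Any:
    """Read back an object written by save_pickle()."""
    with path.open("rb") as fh:
        return pickle.load(fh)


# Plain dictionaries standing in for series (values echo the dashboard table)
original = [
    {"id": "BOGMBASE", "values": [5686.0, 5478.0]},
    {"id": "CPIAUCSL", "values": [323.4, 324.4]},
]

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "sample_collection.pickle"
    save_pickle(original, target)
    restored = load_pickle(target)

print(restored == original)          # same content after the round trip
print(restored is original)          # but a distinct object
print([r["id"] for r in restored])
```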


In [11]:
load_back.dashboard.plots()

## Advanced Usage {#advanced}

Here are some advanced techniques for working with TulipSeries and TulipCollections.

In [12]:
# Advanced Usage Examples

# 1. Convert collection to pandas DataFrame
print("=== Converting to DataFrame ===")
try:
    df = load_back.ts  # .ts property converts to DataFrame
    print(f"DataFrame shape: {df.shape}")
    print(f"DataFrame columns: {list(df.columns)}")
    print("First few rows:")
    print(df.head())
except Exception as e:
    print(f"DataFrame conversion error: {e}")

# 2. Customize metadata for better analysis
print(f"\n=== Metadata Customization ===")
cpi_series = load_back[2]
print(f"Original title: {cpi_series.info.title}")

# Customize metadata
cpi_series.info.quote_units = "%"
if hasattr(cpi_series.info, "frequency"):
    print(f"Frequency: {cpi_series.info.frequency}")

# 3. Mathematical operations (pandas compatibility)
print(f"\n=== Mathematical Operations ===")
monetary_series = load_back[0]
print(f"Latest value: ${monetary_series.ts.iloc[-1]:,.0f} billion")

# Calculate year-over-year growth
if len(monetary_series.ts) > 12:
    yoy_growth = (monetary_series.ts.iloc[-1] / monetary_series.ts.iloc[-13] - 1) * 100
    print(f"Year-over-year growth: {yoy_growth:.1f}%")

# Calculate 6-month moving average
ma_6m = monetary_series.ts.rolling(6).mean()
print(f"6-month MA (latest): ${ma_6m.iloc[-1]:,.0f} billion")

=== Converting to DataFrame ===
DataFrame shape: (945, 3)
DataFrame columns: ['BOGMBASE', 'BOGMBBM', 'CPIAUCSL']
First few rows:
            BOGMBASE  BOGMBBM  CPIAUCSL
date                                   
1947-01-31       NaN      NaN     21.48
1947-02-28       NaN      NaN     21.62
1947-03-31       NaN      NaN     22.00
1947-04-30       NaN      NaN     22.00
1947-05-31       NaN      NaN     21.95

=== Metadata Customization ===
Original title: Consumer Price Index

=== Mathematical Operations ===
Latest value: $5,478 billion
Year-over-year growth: -2.0%
6-month MA (latest): $5,672 billion
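
The `NaN` values in the first rows of the DataFrame above appear because the series start on different dates: CPIAUCSL begins in 1947, while the earlier output shows BOGMBASE beginning in 1959. The sketch below reproduces the same effect with plain pandas and small illustrative series. It assumes nothing about how Tulip builds the DataFrame internally.

```python
import pandas as pd

# Two illustrative monthly series with different start dates
long_series = pd.Series(
    [21.48, 21.62, 22.00, 22.00],
    index=pd.date_range("1947-01-01", periods=4, freq="MS"),
    name="LONG",
)
short_series = pd.Series(
    [50.0, 51.0],
    index=pd.date_range("1947-03-01", periods=2, freq="MS"),
    name="SHORT",
)

# Outer join on the date index: missing observations become NaN
df = pd.concat([long_series, short_series], axis=1)
print(df)

# Keep only the dates where every series has data
common = df.dropna()
print(common.shape)
```

Calling `dropna()` before a comparative calculation avoids silently mixing periods in which only some series have observations.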


### Dashboard Plot Customization

The `dashboard.plots()` method supports various parameters for customizing visualizations:

```python
# Common plot customization options:
collection.dashboard.plots(
    hlines=50,              # Add horizontal reference lines (e.g., PMI 50 line)
    years_limit=3,          # Limit time range to last 3 years
    mma=12,                 # Add 12-period moving average
    tick_suffix='%',        # Add percentage suffix to y-axis
    show_0=True            # Include zero line in plots
)
```

### Creating Collections from Multiple Data Sources

You can create collections from different data providers:

```python
# Bloomberg example
from tulip.data.bloomberg import BloombergClient
bb = BloombergClient()
pmi_collection = bb.create_collection(['ISM PRCM Index', 'NAPMPMI Index'])

# Haver example  
from tulip.data.haver import HaverClient
hv = HaverClient()
employment_collection = hv.create_collection(['LRMANUA@USECON', 'CECIINJC@USECON'])

# Manual collection creation
from tulip.core.collection import TulipCollection  # same import path as in In [1]
manual_series_list = [series1, series2, series3]  # Your existing TulipSeries objects
manual_collection = TulipCollection(manual_series_list)
```

### Good/Bad Value Interpretation

Collections can specify whether higher values are good or bad for interpretation:

```python
# Set interpretation for specific series (1 = higher is good, -1 = higher is bad)
collection.good_is['INJCJC Index'] = -1  # Unemployment claims (higher = worse)
collection.good_is['GDP Index'] = 1      # GDP growth (higher = better)
```
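
How Tulip consumes `good_is` internally is not shown in this tutorial. One plausible use, sketched here with a hypothetical helper and made-up numbers, is to multiply each series' change by its flag so that a positive result always reads as an improvement.

```python
from typing import Dict

# Flags as in the snippet above: 1 means higher is better, -1 means higher is worse
good_is: Dict[str, int] = {"INJCJC Index": -1, "GDP Index": 1}

# Made-up latest changes for each series (not real data)
changes: Dict[str, float] = {"INJCJC Index": 12.0, "GDP Index": 0.4}


def oriented_change(code: str) -> float:
    """Flip the sign of a change so that positive always means 'better'."""
    # Series without a flag default to 'higher is better'
    return changes[code] * good_is.get(code, 1)


for code in changes:
    value = oriented_change(code)
    verdict = "improving" if value > 0 else "worsening" if value < 0 else "flat"
    print(f"{code}: {value:+.1f} ({verdict})")
```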

## Summary

This tutorial covered the essential aspects of TulipSeries and TulipCollection:

### Key Takeaways

1. **TulipSeries** enhances pandas.Series with financial metadata and domain-specific methods
2. **TulipCollection** enables batch operations and dashboard generation for multiple series
3. **Data clients** (FRED, Bloomberg, Haver) provide easy collection creation
4. **Dashboard tools** generate summary tables and customizable visualizations
5. **Persistence** allows saving/loading collections for data sharing and workflow continuity
6. **Full pandas compatibility** ensures seamless integration with existing data analysis workflows

### Best Practices

- Use descriptive titles for better dashboard readability
- Leverage the `good_is` attribute for proper value interpretation
- Take advantage of persistence for expensive data operations
- Customize dashboard plots based on data characteristics (reference lines, time ranges, etc.)
- Use the `.ts` property when you need raw pandas functionality (a Series for a single TulipSeries, a DataFrame for a TulipCollection)

The Tulip library provides a robust foundation for financial and economic data analysis, combining the flexibility of pandas with domain-specific enhancements for professional financial research workflows.