# Series Collection

Users of the FRED API often want to analyze multiple economic series at once. This can be done with `FredSeries` alone, but the process is tedious. `pyfredapi` offers the `SeriesCollection` class to streamline collecting and reshaping the data so it is ready for plotting and analysis.

A `SeriesCollection` object is a set of `SeriesData` objects. `SeriesCollection` provides helper methods to:

* List the metadata (frequency, seasonality, units, etc.) of the series in the collection
* Merge series dataframes into a long dataframe
* Merge series dataframes into a wide dataframe by index
* Merge series dataframes into a wide dataframe by date

## Setup

Import `SeriesCollection` and create an instance.

In [1]:
from pyfredapi import SeriesCollection
from rich import print as rprint

cpi_sc = SeriesCollection()

## Collect data

Add data to the collection with `add_series()`. By default the column for the series values will be renamed to the series id.

In [2]:
cpi_series = ["CPIAUCSL", "CPILFESL"]
cpi_sc.add_series(series_ids=cpi_series)

Requesting series CPIAUCSL...
Requesting series CPILFESL...


### Remove series

Remove series from the collection with `drop_series()`.

In [3]:
cpi_sc.drop_series("CPILFESL")

Removed series CPILFESL


## Accessing the data

The `SeriesCollection` is composed of `SeriesData` objects. You can access a series either as an attribute or through the `data` dictionary.

### Access via attribute

In [4]:
type(cpi_sc.CPIAUCSL)

pyfredapi.api.series.SeriesData

In [5]:
rprint(cpi_sc.CPIAUCSL.info)

In [6]:
cpi_sc.CPIAUCSL.data.head()

Unnamed: 0,date,CPIAUCSL
0,1947-01-01,21.48
1,1947-02-01,21.62
2,1947-03-01,22.0
3,1947-04-01,22.0
4,1947-05-01,21.95


### Access via dictionary

In [7]:
cpi_sc.data["CPIAUCSL"].data.head()

Unnamed: 0,date,CPIAUCSL
0,1947-01-01,21.48
1,1947-02-01,21.62
2,1947-03-01,22.0
3,1947-04-01,22.0
4,1947-05-01,21.95
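
Both access paths return the same object. One common way to get that behavior in Python is to keep the series in a dictionary and fall back to it in `__getattr__`. The sketch below uses a hypothetical `MiniCollection` class to illustrate the pattern; it is not pyfredapi's actual implementation.

```python
class MiniCollection:
    """Hypothetical stand-in for a collection keyed by series id."""

    def __init__(self):
        # Series objects live in a plain dictionary.
        self.data = {}

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails,
        # so real attributes such as `data` are unaffected.
        try:
            return self.__dict__["data"][name]
        except KeyError:
            raise AttributeError(name) from None


collection = MiniCollection()
collection.data["CPIAUCSL"] = {"note": "placeholder series object"}

# Attribute access and dictionary access resolve to the same object.
same_object = collection.CPIAUCSL is collection.data["CPIAUCSL"]
```

Attribute access is convenient for interactive work, while the dictionary is easier to loop over or to use with ids held in variables.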


## Rename series in the collection

### Rename on add

You can rename the series when adding them to the collection. Renaming can be done with a dictionary mapping the series id to the new name, or with a function which parses the series title into the new name.

In [8]:
# Rename with a dictionary
new_names = {
    "CPIAUCSL": "cpi_all_items",
    "CPILFESL": "cpi_all_items_less_food_and_energy",
}

cpi_sc = SeriesCollection()
cpi_sc.add_series(series_ids=cpi_series, rename=new_names)

Requesting series CPIAUCSL...
Requesting series CPILFESL...


In [9]:
cpi_sc.CPIAUCSL.data.head()

Unnamed: 0,date,cpi_all_items
0,1947-01-01,21.48
1,1947-02-01,21.62
2,1947-03-01,22.0
3,1947-04-01,22.0
4,1947-05-01,21.95


In [10]:
cpi_sc.CPILFESL.data.head()

Unnamed: 0,date,cpi_all_items_less_food_and_energy
0,1957-01-01,28.5
1,1957-02-01,28.6
2,1957-03-01,28.7
3,1957-04-01,28.8
4,1957-05-01,28.8
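
The two rename forms differ in what they look at: a dictionary is keyed by series id, while a function receives the series title. A minimal sketch of that dispatch follows. The helper `resolve_column_name` and the placeholder title are hypothetical, and falling back to the series id for a missing key is this sketch's own choice, not documented library behavior.

```python
from typing import Callable, Mapping, Optional, Union

# Either a mapping of series id -> new name, or a function of the series title.
Rename = Union[Mapping[str, str], Callable[[str], str]]


def resolve_column_name(
    series_id: str, title: str, rename: Optional[Rename] = None
) -> str:
    """Hypothetical helper: choose the column name for one series."""
    if rename is None:
        return series_id  # default: keep the series id
    if callable(rename):
        return rename(title)  # function form: parse the title
    # Mapping form: look up the id (assumed fallback: the id itself).
    return rename.get(series_id, series_id)


by_default = resolve_column_name("CPIAUCSL", "All Items")
by_mapping = resolve_column_name(
    "CPIAUCSL", "All Items", {"CPIAUCSL": "cpi_all_items"}
)
by_function = resolve_column_name(
    "CPIAUCSL", "All Items", lambda title: title.lower().replace(" ", "_")
)
```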


### Rename after add

You can rename series already in the collection with the `rename_series` method. It works the same way as renaming on add: pass either a dictionary or a function.

In [11]:
def parse_cpi_title(title: str) -> str:
    """Parse CPI series title into a readable label."""
    return (
        title.lower()
        .replace("consumer price index", "cpi ")
        .replace(" for all urban consumers: ", "")
        .replace(" in u.s. city average", "")
        .replace(" ", "_")
    )


cpi_sc.rename_series(rename=parse_cpi_title)

In [12]:
cpi_sc.CPIAUCSL.data.head()

Unnamed: 0,date,cpi_all_items
0,1947-01-01,21.48
1,1947-02-01,21.62
2,1947-03-01,22.0
3,1947-04-01,22.0
4,1947-05-01,21.95
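
To see why the title parser yields these column names, it helps to run it on the titles directly, without calling the API. The title strings below are assumed to match the titles FRED reports for these two series; check them against `info` on the series before relying on them.

```python
def parse_cpi_title(title: str) -> str:
    """Parse CPI series title into a readable label."""
    return (
        title.lower()
        .replace("consumer price index", "cpi ")
        .replace(" for all urban consumers: ", "")
        .replace(" in u.s. city average", "")
        .replace(" ", "_")
    )


# Assumed FRED titles for the two series used in this notebook.
titles = {
    "CPIAUCSL": (
        "Consumer Price Index for All Urban Consumers: "
        "All Items in U.S. City Average"
    ),
    "CPILFESL": (
        "Consumer Price Index for All Urban Consumers: "
        "All Items Less Food and Energy in U.S. City Average"
    ),
}

parsed = {series_id: parse_cpi_title(title) for series_id, title in titles.items()}
# parsed["CPIAUCSL"] -> "cpi_all_items"
# parsed["CPILFESL"] -> "cpi_all_items_less_food_and_energy"
```

Because the parser depends on exact substrings, a title with different wording would pass through mostly unchanged apart from lowercasing and underscores.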


## List metadata

`SeriesCollection` has several list methods that print the metadata of the series in the collection.

### Series in the collection

In [13]:
cpi_sc.list_series()

### Frequency

In [14]:
cpi_sc.list_frequency()

All series are Monthly


### Seasonality

In [15]:
cpi_sc.list_seasonality()

All series are Seasonally Adjusted


### Units

In [16]:
cpi_sc.list_units()

All series are that are measured in Index 1982-1984=100


### Dates

In [17]:
cpi_sc.list_end_date()

All series end on 2022-09-01


In [18]:
cpi_sc.list_start_date()
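
The list methods report a single line when every series shares the same value. The sketch below shows one way to build that kind of summary from a mapping of series id to metadata value. The function `summarize_metadata` is hypothetical, and the wording for the mixed and empty cases is this sketch's own, not the library's output.

```python
from collections import defaultdict


def summarize_metadata(attribute: str, values_by_series: dict) -> str:
    """Hypothetical helper: summarize one metadata attribute across series."""
    if not values_by_series:
        return "No series in the collection"

    # Group series ids by their metadata value.
    groups = defaultdict(list)
    for series_id, value in values_by_series.items():
        groups[value].append(series_id)

    if len(groups) == 1:
        (value,) = groups  # the single shared value
        return f"All series are {value}"

    lines = [f"{attribute} differs across series:"]
    for value, series_ids in groups.items():
        lines.append(f"  {value}: {', '.join(series_ids)}")
    return "\n".join(lines)


uniform = summarize_metadata(
    "frequency", {"CPIAUCSL": "Monthly", "CPILFESL": "Monthly"}
)
mixed = summarize_metadata("frequency", {"CPIAUCSL": "Monthly", "GDP": "Quarterly"})
```

Checking frequency this way before merging is useful, since series with different frequencies do not share a date index.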

## Merge data

`SeriesCollection` supports merging the data into long and wide formats. By default, the series ID is used as the column name (wide format) or observation label (long format); renamed series use their new names instead.

### Merge long

Merge the series in the collection into a long pandas dataframe.

In [19]:
cpi_long = cpi_sc.merge_long()
cpi_long

Unnamed: 0,date,value,series
0,1947-01-01,21.480,cpi_all_items
1,1947-02-01,21.620,cpi_all_items
2,1947-03-01,22.000,cpi_all_items
3,1947-04-01,22.000,cpi_all_items
4,1947-05-01,21.950,cpi_all_items
...,...,...,...
1693,2022-05-01,292.289,cpi_all_items_less_food_and_energy
1694,2022-06-01,294.354,cpi_all_items_less_food_and_energy
1695,2022-07-01,295.275,cpi_all_items_less_food_and_energy
1696,2022-08-01,296.950,cpi_all_items_less_food_and_energy
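
The long format stacks every series into the same `date` and `value` columns and adds a `series` label. A minimal pandas sketch of that reshaping, using two tiny hand-built frames with values copied from the outputs above, might look like this. It illustrates the shape of the result and is not necessarily how `merge_long()` is implemented.

```python
import pandas as pd

# Small stand-ins for two series' dataframes (values taken from the outputs above).
cpi_all = pd.DataFrame(
    {
        "date": pd.to_datetime(["1947-01-01", "1947-02-01"]),
        "cpi_all_items": [21.48, 21.62],
    }
)
cpi_core = pd.DataFrame(
    {
        "date": pd.to_datetime(["1957-01-01", "1957-02-01"]),
        "cpi_all_items_less_food_and_energy": [28.5, 28.6],
    }
)

frames = []
for frame in (cpi_all, cpi_core):
    value_column = frame.columns[1]  # the series' value column
    frames.append(
        frame.rename(columns={value_column: "value"}).assign(series=value_column)
    )

# Stack the frames on top of each other with a fresh integer index.
long_df = pd.concat(frames, ignore_index=True)
```

Long data like this is convenient for plotting libraries that group lines by a label column.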


### Merge as-of

Merge the series in the collection into a wide pandas dataframe based on nearest date. You must specify a base series; its dates serve as the basis for the join.

In [20]:
cpi_asof = cpi_sc.merge_asof(base_series_id="CPIAUCSL")
cpi_asof.tail()

Unnamed: 0,date,cpi_all_items,cpi_all_items_less_food_and_energy
904,2022-05-01,291.474,292.289
905,2022-06-01,295.328,294.354
906,2022-07-01,295.271,295.275
907,2022-08-01,295.62,296.95
908,2022-09-01,296.761,298.66
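
An as-of join keeps every date from the base series and attaches the closest observation from each other series. pandas exposes this as `pd.merge_asof`. The sketch below uses made-up values and `direction="nearest"` to match the description above; whether `SeriesCollection.merge_asof()` uses that exact setting is an assumption.

```python
import pandas as pd

# Hypothetical data: the base series is monthly, the other is irregular.
base = pd.DataFrame(
    {
        "date": pd.to_datetime(["2022-01-01", "2022-02-01", "2022-03-01"]),
        "base_value": [1.0, 2.0, 3.0],
    }
)
other = pd.DataFrame(
    {
        "date": pd.to_datetime(["2022-01-03", "2022-03-05"]),
        "other_value": [10.0, 30.0],
    }
)

# Both frames must be sorted by the join key.
# Each base date is matched to the closest date in `other`.
asof = pd.merge_asof(base, other, on="date", direction="nearest")
```

The result has one row per base date. Here the February row picks up the January observation from `other`, because it is 29 days away while the March one is 32 days away.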


### Merge wide

Merge the series in the collection into a wide pandas dataframe. Only works if all the series in the collection share the same date index.

In [21]:
cpi_wide = cpi_sc.merge_wide()
cpi_wide.tail()

Unnamed: 0,date,cpi_all_items,cpi_all_items_less_food_and_energy
904,2022-05-01,291.474,292.289
905,2022-06-01,295.328,294.354
906,2022-07-01,295.271,295.275
907,2022-08-01,295.62,296.95
908,2022-09-01,296.761,298.66
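
The shared-date requirement can be checked up front before joining. The sketch below is a hypothetical `merge_wide_sketch` helper that compares date columns and then joins on `date`, using values copied from the output above. Raising `ValueError` on a mismatch is this sketch's own choice; how the library itself handles mismatched dates is not shown here.

```python
from functools import reduce

import pandas as pd


def merge_wide_sketch(frames):
    """Hypothetical helper: join frames on `date` if their dates match exactly."""
    first_dates = frames[0]["date"].reset_index(drop=True)
    for frame in frames[1:]:
        if not first_dates.equals(frame["date"].reset_index(drop=True)):
            raise ValueError("Series do not share the same dates")
    # Join the frames pairwise on the shared date column.
    return reduce(lambda left, right: left.merge(right, on="date"), frames)


dates = pd.to_datetime(["2022-05-01", "2022-06-01"])
cpi_all = pd.DataFrame({"date": dates, "cpi_all_items": [291.474, 295.328]})
cpi_core = pd.DataFrame(
    {"date": dates, "cpi_all_items_less_food_and_energy": [292.289, 294.354]}
)

wide_df = merge_wide_sketch([cpi_all, cpi_core])
```

When the dates do not line up, an as-of merge against a base series is the alternative.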
