# Series Collection

Users of the FRED API often want to analyze multiple economic series at once. This can be done with `FredSeries` alone, but doing so is tedious and cumbersome. `pyfredapi` offers the `SeriesCollection` class to streamline collecting and munging data for plotting and analysis.

A `SeriesCollection` object is a set of `SeriesData` objects. `SeriesCollection` provides helpful methods to:

* List the metadata (frequency, seasonality, units, etc.) of the series in the collection
* Merge series dataframes into a long dataframe
* Merge series dataframes into a wide dataframe by index
* Merge series dataframes into a wide dataframe by date

## Setup

Import `pyfredapi`, along with `pprint` from `rich` for readable output.

In [1]:
import pyfredapi as pf
from rich.pretty import pprint

## Create a SeriesCollection

Create an instance of `SeriesCollection`, passing the ids of the series to request. By default the column for the series values will be renamed to the series id.

In [2]:
sc = pf.SeriesCollection(series_id=["GDP"])

Requesting series GDP...


## Collect additional series

Add more series to a `SeriesCollection` object with `add()`.

In [3]:
sc.add(series_id=["SP500"])

Requesting series SP500...


### Remove series

Remove series from the collection with `remove()`.

In [4]:
sc.remove("SP500")

Removed series SP500


## Plot Series

The `plot` method builds a [plotly](https://plotly.com/python/) time series plot of the data.

In [5]:
fig = sc.GDP.plot()
# choose the renderer appropriate for your environment
fig.show(renderer="sphinx_gallery")

## Accessing the data

A `SeriesCollection` is composed of `SeriesData` objects. You can access each `SeriesData` object by attribute or with bracket notation. Every `series_id` added to the collection becomes an attribute that returns the `SeriesData` object for that series.

`SeriesData` has two attributes:

* `info` - The series metadata.
* `df` - Series observations in a pandas dataframe.
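
The two access styles can be illustrated with a small stand-alone sketch. `MiniSeries` and `MiniCollection` below are hypothetical stand-ins written for this example; they are not the `pyfredapi` classes, and the real implementation may differ. The sketch only shows one way a container can return the same object for `obj.GDP` and `obj["GDP"]`.

```python
from dataclasses import dataclass, field


@dataclass
class MiniSeries:
    """Hypothetical stand-in for SeriesData: metadata plus observations."""

    info: dict
    df: list  # a plain list here; the real object holds a pandas dataframe


@dataclass
class MiniCollection:
    """Hypothetical container supporting attribute and bracket access."""

    data: dict = field(default_factory=dict)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails,
        # so `data` itself is still found the usual way.
        try:
            return self.__dict__["data"][name]
        except KeyError:
            raise AttributeError(name) from None

    def __getitem__(self, key):
        return self.data[key]


mc = MiniCollection()
mc.data["GDP"] = MiniSeries(info={"id": "GDP", "frequency": "Quarterly"}, df=[])

# Both access styles return the very same object.
same = mc.GDP is mc["GDP"]
print(same)
```

Attribute access reads well interactively, while bracket access is convenient when the series id is held in a variable.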

### Access via attribute

In [6]:
sc.GDP == sc["GDP"]

True

In [7]:
pprint(sc.GDP.info)

In [8]:
sc.GDP.df.tail()

| | date | GDP |
|---|---|---|
| 306 | 2022-07-01 | 25994.639 |
| 307 | 2022-10-01 | 26408.405 |
| 308 | 2023-01-01 | 26813.601 |
| 309 | 2023-04-01 | 27063.012 |
| 310 | 2023-07-01 | 27623.543 |


### Access via bracket notation

In [9]:
pprint(sc["GDP"].info)

## Rename series in the collection

### Rename on add

You can rename the series when adding them to the collection. Renaming can be done with a dictionary mapping the series id to the new name, or with a function which parses the series title into the new name.

In [10]:
# Rename with a dictionary
new_names = {
    "CPIAUCSL": "cpi_all_items",
    "CPILFESL": "cpi_all_items_less_food_and_energy",
}

cpi_sc = pf.SeriesCollection(series_id=["CPIAUCSL", "CPILFESL"], rename=new_names)

Requesting series CPIAUCSL...
Requesting series CPILFESL...


In [11]:
cpi_sc.CPIAUCSL.df.head()

| | date | cpi_all_items |
|---|---|---|
| 0 | 1947-01-01 | 21.48 |
| 1 | 1947-02-01 | 21.62 |
| 2 | 1947-03-01 | 22.0 |
| 3 | 1947-04-01 | 22.0 |
| 4 | 1947-05-01 | 21.95 |


In [12]:
cpi_sc.CPILFESL.df.head()

| | date | cpi_all_items_less_food_and_energy |
|---|---|---|
| 0 | 1957-01-01 | 28.5 |
| 1 | 1957-02-01 | 28.6 |
| 2 | 1957-03-01 | 28.7 |
| 3 | 1957-04-01 | 28.8 |
| 4 | 1957-05-01 | 28.8 |


### Rename after add

You can rename series already in the collection with the `rename_series` method. It works the same way as renaming on add: pass either a dictionary or a function.

In [13]:
def parse_cpi_title(title: str) -> str:
    """Parse CPI series title into a readable label."""
    return (
        title.lower()
        .replace("consumer price index", "CPI ")
        .replace(" for all urban consumers: ", "")
        .replace(" in u.s. city average", "")
        .title()
    )


cpi_sc.rename_series(rename=parse_cpi_title)

In [14]:
cpi_sc.CPIAUCSL.df.head()

| | date | Cpi All Items |
|---|---|---|
| 0 | 1947-01-01 | 21.48 |
| 1 | 1947-02-01 | 21.62 |
| 2 | 1947-03-01 | 22.0 |
| 3 | 1947-04-01 | 22.0 |
| 4 | 1947-05-01 | 21.95 |
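
To make the two renaming styles concrete, here is a minimal sketch that runs without an API key. `resolve_name` is a hypothetical helper written for this example, not part of `pyfredapi`; it only illustrates how a `rename` argument that is either a dictionary or a function could be turned into a column name. The titles are the FRED titles of the two CPI series.

```python
from typing import Callable, Dict, Union

Rename = Union[Dict[str, str], Callable[[str], str], None]


def resolve_name(series_id: str, title: str, rename: Rename = None) -> str:
    """Hypothetical helper: choose the column name for one series."""
    if rename is None:
        return series_id  # default: keep the series id
    if callable(rename):
        return rename(title)  # a function parses the series title
    return rename.get(series_id, series_id)  # a dict maps id to new name


def parse_cpi_title(title: str) -> str:
    """Parse CPI series title into a readable label."""
    return (
        title.lower()
        .replace("consumer price index", "CPI ")
        .replace(" for all urban consumers: ", "")
        .replace(" in u.s. city average", "")
        .title()
    )


titles = {
    "CPIAUCSL": "Consumer Price Index for All Urban Consumers: "
    "All Items in U.S. City Average",
    "CPILFESL": "Consumer Price Index for All Urban Consumers: "
    "All Items Less Food and Energy in U.S. City Average",
}

by_dict = resolve_name("CPIAUCSL", titles["CPIAUCSL"], {"CPIAUCSL": "cpi_all_items"})
by_func = [resolve_name(sid, title, parse_cpi_title) for sid, title in titles.items()]
print(by_dict)
print(by_func)
```

A dictionary is the simpler choice for a handful of series; a function scales better when many series share a title pattern.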


## List metadata

`SeriesCollection` has a number of list methods to print out the metadata of the series in the collection.

### Series in the collection

In [15]:
cpi_sc.list_series()

### Frequency

In [16]:
cpi_sc.list_frequency()

All series are Monthly


### Seasonality

In [17]:
cpi_sc.list_seasonality()

All series are Seasonally Adjusted


### Units

In [18]:
cpi_sc.list_units()

All series are that are measured in Index 1982-1984=100


### Dates

In [19]:
cpi_sc.list_end_date()

All series end on 2023-09-01


In [20]:
cpi_sc.list_start_date()
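
The list methods above report whether the series in a collection agree on a metadata field. The idea can be sketched in plain Python as follows. `summarize` and the metadata dictionaries are hypothetical and exist only for this illustration; the message format for mixed values is an assumption and not the library's actual output.

```python
from collections import defaultdict


def summarize(field_name: str, metadata: dict) -> str:
    """Hypothetical helper: group series ids by the value of one metadata field."""
    groups = defaultdict(list)
    for series_id, info in metadata.items():
        groups[info[field_name]].append(series_id)

    if len(groups) == 1:
        (value,) = groups  # the single shared value
        return f"All series are {value}"
    # Assumed format for the mixed case, for illustration only.
    return "; ".join(f"{value}: {', '.join(ids)}" for value, ids in groups.items())


# Illustrative metadata, limited to the field used here.
cpi_meta = {
    "CPIAUCSL": {"frequency": "Monthly"},
    "CPILFESL": {"frequency": "Monthly"},
}
mixed_meta = {**cpi_meta, "GDP": {"frequency": "Quarterly"}}

uniform = summarize("frequency", cpi_meta)
mixed = summarize("frequency", mixed_meta)
print(uniform)
print(mixed)
```

Checking frequency this way before merging is useful, because a wide merge only makes sense when the series share the same dates.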

## Merge data

`SeriesCollection` supports merging the data into long and wide formats. By default the series ID will be used as the column name or observation label.

### Merge long

Merge the series in the collection into a long pandas dataframe.

In [21]:
cpi_long = cpi_sc.merge_long()
cpi_long

| | date | value | series |
|---|---|---|---|
| 0 | 1947-01-01 | 21.480 | Cpi All Items |
| 1 | 1947-02-01 | 21.620 | Cpi All Items |
| 2 | 1947-03-01 | 22.000 | Cpi All Items |
| 3 | 1947-04-01 | 22.000 | Cpi All Items |
| 4 | 1947-05-01 | 21.950 | Cpi All Items |
| ... | ... | ... | ... |
| 1717 | 2023-05-01 | 307.824 | Cpi All Items Less Food And Energy |
| 1718 | 2023-06-01 | 308.309 | Cpi All Items Less Food And Energy |
| 1719 | 2023-07-01 | 308.801 | Cpi All Items Less Food And Energy |
| 1720 | 2023-08-01 | 309.661 | Cpi All Items Less Food And Energy |
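
For readers who want to see what a long merge amounts to, the following is a minimal pandas sketch. `to_long` is a hypothetical function written for this example; it assumes each frame has a `date` column followed by one value column, and it is not necessarily how `merge_long` is implemented.

```python
import pandas as pd

# Two small frames using the first observations shown earlier.
cpi_all = pd.DataFrame(
    {
        "date": pd.to_datetime(["1947-01-01", "1947-02-01"]),
        "Cpi All Items": [21.48, 21.62],
    }
)
cpi_core = pd.DataFrame(
    {
        "date": pd.to_datetime(["1957-01-01", "1957-02-01"]),
        "Cpi All Items Less Food And Energy": [28.5, 28.6],
    }
)


def to_long(frames):
    """Hypothetical helper: stack frames into date / value / series columns."""
    parts = []
    for df in frames:
        value_col = df.columns[-1]  # assumes layout: date, <series name>
        parts.append(
            df.rename(columns={value_col: "value"}).assign(series=value_col)
        )
    return pd.concat(parts, ignore_index=True)


long_df = to_long([cpi_all, cpi_core])
print(long_df)
```

The long format stacks series of different lengths without introducing missing values, and it suits plotting libraries that group by a label column.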


### Merge as-of

Merge the series in the collection into a wide pandas dataframe based on the nearest date. You must define a base series; its dates serve as the basis for the join.

In [22]:
cpi_asof = cpi_sc.merge_asof(base_series_id="CPIAUCSL")
cpi_asof.tail()

| | date | Cpi All Items | Cpi All Items Less Food And Energy |
|---|---|---|---|
| 916 | 2023-05-01 | 303.294 | 307.824 |
| 917 | 2023-06-01 | 303.841 | 308.309 |
| 918 | 2023-07-01 | 304.348 | 308.801 |
| 919 | 2023-08-01 | 306.269 | 309.661 |
| 920 | 2023-09-01 | 307.481 | 310.661 |
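
An as-of merge matters most when the series have different frequencies, which the CPI example does not show. The sketch below uses `pandas.merge_asof` directly to join monthly CPI values with quarterly GDP values taken from the outputs above. Using `direction="nearest"` is an assumption made to match the description of the method; the library may configure the join differently.

```python
import pandas as pd

# Monthly base series (values from the CPI output above).
cpi = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2023-05-01", "2023-06-01", "2023-07-01", "2023-08-01", "2023-09-01"]
        ),
        "CPIAUCSL": [303.294, 303.841, 304.348, 306.269, 307.481],
    }
)

# Quarterly series (values from the GDP output above).
gdp = pd.DataFrame(
    {
        "date": pd.to_datetime(["2023-01-01", "2023-04-01", "2023-07-01"]),
        "GDP": [26813.601, 27063.012, 27623.543],
    }
)

# merge_asof requires both frames to be sorted on the join key.
merged = pd.merge_asof(
    cpi.sort_values("date"),
    gdp.sort_values("date"),
    on="date",
    direction="nearest",  # assumption: match each base date to the closest date
)
print(merged)
```

The result keeps one row per date of the base series. Here the May row picks up the April GDP value, and the June through September rows pick up the July value, because those are the closest quarterly dates.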


### Merge wide

Merge the series in the collection into a wide pandas dataframe. Only works if all the series in the collection share the same date index.

In [23]:
cpi_wide = cpi_sc.merge_wide()
cpi_wide.tail()

| | date | Cpi All Items | Cpi All Items Less Food And Energy |
|---|---|---|---|
| 916 | 2023-05-01 | 303.294 | 307.824 |
| 917 | 2023-06-01 | 303.841 | 308.309 |
| 918 | 2023-07-01 | 304.348 | 308.801 |
| 919 | 2023-08-01 | 306.269 | 309.661 |
| 920 | 2023-09-01 | 307.481 | 310.661 |
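
The shared-date requirement can be sketched with plain pandas. `to_wide` below is a hypothetical helper, not the library's implementation; raising a `ValueError` when the dates differ is an assumption chosen for this illustration.

```python
from functools import reduce

import pandas as pd

dates = pd.to_datetime(["2023-07-01", "2023-08-01", "2023-09-01"])
cpi_all = pd.DataFrame({"date": dates, "CPIAUCSL": [304.348, 306.269, 307.481]})
cpi_core = pd.DataFrame({"date": dates, "CPILFESL": [308.801, 309.661, 310.661]})


def to_wide(frames):
    """Hypothetical helper: join frames on date after checking the dates match."""
    base_dates = frames[0]["date"].reset_index(drop=True)
    for df in frames[1:]:
        if not base_dates.equals(df["date"].reset_index(drop=True)):
            raise ValueError("series do not share the same dates")
    # Join one frame at a time on the shared date column.
    return reduce(lambda left, right: left.merge(right, on="date"), frames)


wide = to_wide([cpi_all, cpi_core])
print(wide)
```

When the dates do not line up, as with monthly and quarterly series, an as-of merge is the better fit.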
