# Demo of basic use of the stats_can package

This notebook provides an overview of the basic functionality of the stats_can package.

## Getting set up

Most of the functionality you would want is available through the StatsCan class. You can import and instantiate it from the base of the stats_can package:

In [None]:
from stats_can import StatsCan
sc = StatsCan()

Note that by default the StatsCan object stores all the data it retrieves in a file in your working directory. To save it elsewhere, pass that path as either a string or a `pathlib.Path` when you instantiate the object.
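
As a rough sketch, a custom storage location could be prepared with `pathlib` before creating the object. The folder name `stats_can_data` is a placeholder chosen for illustration, and the temporary directory is used only so the snippet runs anywhere. The commented-out lines assume the path is accepted as the first positional argument, which this notebook does not confirm.

```python
import tempfile
from pathlib import Path

# Hypothetical folder name. The system temp directory is used here only so
# the sketch runs anywhere; pick a permanent location for real work.
data_dir = Path(tempfile.gettempdir()) / "stats_can_data"
data_dir.mkdir(parents=True, exist_ok=True)  # create the folder if it is missing

print(data_dir.is_dir())

# Assumption: the path is passed as the first argument at instantiation.
# from stats_can import StatsCan
# sc = StatsCan(data_dir)
```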

## Primary use case - loading a table

The most common thing you'll want to do with this package is retrieve data from Statistics Canada. We can do that now, using [this table](https://www150.statcan.gc.ca/t1/tbl1/en/tv.action?pid=1410035301) measuring homeless shelter capacity as an example.

In [None]:
shelter_df = sc.table_to_df("14-10-0353-01")

In [None]:
shelter_df.head()

In [None]:
shelter_df.dtypes

If the table has not been retrieved before, the library downloads it, extracts it, and loads it into a dataframe. If it has been retrieved in the past, the library loads the previously processed copy instead. As the output above shows, the library parses dates into datetime columns and stores categorical columns with a categorical dtype.
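
The table number passed to `table_to_df` and the `pid` in the linked Statistics Canada URL appear to be the same digits with and without hyphens. The helpers below are hypothetical (they are not part of stats_can) and only illustrate that relationship. The 2-2-4-2 grouping is inferred from this single example.

```python
def table_id_to_pid(table_id: str) -> str:
    """Drop the hyphens from a table number such as "14-10-0353-01"."""
    return table_id.replace("-", "")


def pid_to_table_id(pid: str) -> str:
    """Re-insert hyphens into a 10-digit pid using a 2-2-4-2 grouping."""
    if len(pid) != 10 or not pid.isdigit():
        raise ValueError(f"expected 10 digits, got {pid!r}")
    return f"{pid[:2]}-{pid[2:4]}-{pid[4:8]}-{pid[8:]}"


print(table_id_to_pid("14-10-0353-01"))  # 1410035301
print(pid_to_table_id("1410035301"))     # 14-10-0353-01
```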

## Updating existing tables

If you've retrieved a table at an earlier point and an updated release is available, you can have the package update your local copy:

In [None]:
sc.update_tables()

You can also pass a list of tables to the method if you only want to update a subset of them. In addition to updating the tables, the method returns a list of all tables that were updated. Here, because I downloaded the table just before calling the update method, no tables needed updating and it returns an empty list.
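
Because the return value is a list, a script can branch on whether anything changed. The helper below is a hypothetical sketch, not part of stats_can. It assumes the list holds table identifiers that can be converted to strings, which this notebook does not specify.

```python
def describe_update(updated):
    """Turn the list returned by an update call into a short status message."""
    if not updated:
        return "No tables were updated."
    # Assumption: each entry identifies a table and can be shown as a string.
    return "Updated tables: " + ", ".join(str(t) for t in updated)


print(describe_update([]))
print(describe_update(["14-10-0353-01"]))
```

In a notebook you would pass the result of `sc.update_tables()` to it directly.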

## Pulling in just one vector

Tables are made up of many vectors. If you only want a particular vector for a particular period, downloading the entire table (which can be quite large) would be slow. Instead, you can request just that vector:

In [None]:
sc.vectors_to_df_remote("v113743823", periods=5)

This method always retrieves the latest data directly from Statistics Canada. The first argument is either a single vector or a list of vectors, and the `periods` argument is the number of periods (months or years, depending on the series) to retrieve.
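
To illustrate the "single vector or list of vectors" calling convention, here is a hypothetical helper that normalizes either form into a list of integer IDs. It is a sketch of the general idea only and is not taken from the stats_can source.

```python
def normalize_vectors(vectors):
    """Return a list of integer vector IDs from one ID or a list of IDs."""
    # Wrap a lone ID so the rest of the function always sees a list.
    if isinstance(vectors, (str, int)):
        vectors = [vectors]
    # Strip the leading "v" (in either case) and convert to int.
    return [int(str(v).lower().lstrip("v")) for v in vectors]


print(normalize_vectors("v113743823"))           # [113743823]
print(normalize_vectors(["v113743823", "V42"]))  # [113743823, 42]
```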