<a href="https://colab.research.google.com/github/Aryan-Jhaveri/mcp-statcan/blob/main/JupyterNB.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [3]:
%%bash
git init
git clone https://github.com/Aryan-Jhaveri/mcp-statcan.git


Reinitialized existing Git repository in /content/.git/


Cloning into 'mcp-statcan'...


This Jupyter notebook is a testing playground for learning the Statistics Canada Web Data Service (WDS) API and experimenting with its endpoints.

# Sections:

1. API Key Management:
    - Securely store and manage your Statistics Canada API key.
    - Explore different methods for key storage (environment variables, configuration files).

 2. Data Discovery and Exploration:
    - Utilize the API to browse available datasets and metadata.
    - Implement search functionalities based on keywords or subject matter.
    - Experiment with filtering and refining search results.
    - Visualize dataset metadata to understand data structure and content.


 3. Data Retrieval:
    - Implement functions to retrieve specific data from chosen datasets.
    - Explore various output formats (CSV, JSON, etc.).
    - Handle pagination for large datasets.
    - Implement error handling for API requests (timeouts, rate limits).

 4. Data Processing and Transformation:
    - Process retrieved data (cleaning, formatting, manipulation).
    - Convert data into suitable formats for further analysis (e.g., Pandas DataFrames).
    - Implement data aggregation and summarization methods.


 5. Data Visualization and Analysis:
    - Visualize data using libraries like Matplotlib, Seaborn, Plotly.
    - Perform exploratory data analysis (EDA).
    - Create visualizations that effectively communicate insights.


 6. Advanced Usage (Optional):
    - Explore more advanced API features (e.g., custom time series).
    - Experiment with creating custom visualizations and dashboards.
    - Consider integrating data with other sources or applications.


 Throughout the notebook, include ample comments explaining code logic, and add markdown cells for documentation.
 Ensure all code cells are executable and outputs are presented in a clear format.
 Make use of error handling and logging to debug potential issues effectively.
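
The outline above calls for error handling around timeouts and rate limits, with logging to help debugging. One way this could look is a small retry helper with exponential backoff. This is only a sketch: `call_with_retries`, its parameters, and the logger name are hypothetical and are not part of this notebook's later cells or of any StatCan client.

```python
import logging
import time

logger = logging.getLogger("statcan_playground")  # hypothetical logger name


def call_with_retries(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure, log the error and retry with exponential backoff.

    `func` is any zero-argument callable that raises on failure.
    `sleep` is injectable so the helper can be exercised without real waiting.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as exc:
            if attempt == attempts:
                logger.error("Giving up after %d attempts: %s", attempt, exc)
                raise
            # Delay doubles after each failed attempt: 1s, 2s, 4s, ...
            delay = base_delay * 2 ** (attempt - 1)
            logger.warning("Attempt %d failed (%s); retrying in %.1fs", attempt, exc, delay)
            sleep(delay)
```

A request function can be wrapped as `call_with_retries(lambda: get_changed_series_list())`, provided the wrapped function raises on failure instead of returning an empty list.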

# Goal:
This notebook gives beginners a clear structure for exploring the Statistics Canada API, from basic usage through to advanced topics.

In [21]:
import requests
import json
from datetime import datetime, timedelta

def get_changed_series_list(api_key=None, timeout=30):
    """
    Retrieves the list of series that have changed today from the Statistics Canada API.

    Args:
        api_key (str, optional): API key sent as a Bearer token. If None, no Authorization header is sent. Defaults to None.
        timeout (float, optional): Seconds to wait for the server before giving up. Defaults to 30.

    Returns:
        list: A list of dictionaries representing the changed series. Returns an empty list if an error occurs or no changes are found.
    """

    url = "https://www150.statcan.gc.ca/t1/wds/rest/getChangedSeriesList"
    headers = {}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"  # or however your API key is used

    try:
        # Without a timeout, requests can wait indefinitely on a stalled connection
        response = requests.get(url, headers=headers, timeout=timeout)
        response.raise_for_status()  # Raise HTTPError for bad responses (4xx or 5xx)

        data = response.json()

        if data["status"] == "SUCCESS":
            return data["object"]

        print(f"API request failed: {data.get('message', 'Unknown error')}")
        return []  # Return an empty list in case of failure

    except requests.exceptions.RequestException as e:
        print(f"An error occurred during API request: {e}")
        return []

    except (KeyError, json.JSONDecodeError) as e:
        print(f"Error processing API response: {e}")
        return []

# Example usage
if __name__ == "__main__":
    changed_series = get_changed_series_list()

    if changed_series:
        print(f"Number of changed series: {len(changed_series)}")
        # Process changed series data
        for series in changed_series:
            print(json.dumps(series, indent=2))
    else:
        print("No changes found or an error occurred.")

Number of changed series: 7319
{
  "responseStatusCode": 0,
  "vectorId": 111666240,
  "productId": 33100036,
  "coordinate": "1.17.0.0.0.0.0.0.0.0",
  "releaseTime": "2025-04-08T08:30"
}
{
  "responseStatusCode": 0,
  "vectorId": 111666225,
  "productId": 33100036,
  "coordinate": "1.2.0.0.0.0.0.0.0.0",
  "releaseTime": "2025-04-08T08:30"
}
{
  "responseStatusCode": 0,
  "vectorId": 111666247,
  "productId": 33100036,
  "coordinate": "1.24.0.0.0.0.0.0.0.0",
  "releaseTime": "2025-04-08T08:30"
}
{
  "responseStatusCode": 0,
  "vectorId": 39055,
  "productId": 10100139,
  "coordinate": "1.17.0.0.0.0.0.0.0.0",
  "releaseTime": "2025-04-08T08:30"
}
{
  "responseStatusCode": 0,
  "vectorId": 36610,
  "productId": 10100136,
  "coordinate": "1.1.0.0.0.0.0.0.0.0",
  "releaseTime": "2025-04-08T08:30"
}
{
  "responseStatusCode": 0,
  "vectorId": 36624,
  "productId": 10100136,
  "coordinate": "1.15.0.0.0.0.0.0.0.0",
  "releaseTime": "2025-04-08T08:30"
}
{
  "responseStatusCode": 0,
  "vectorId"

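Printing every record for thousands of changed series is hard to scan. A possible next step is to count the changed series per table (`productId`). The sketch below uses six records copied from the output above, trimmed to three fields; `count_series_by_product` is a hypothetical helper name, not something defined elsewhere in this notebook.

```python
from collections import Counter

# Six records copied from the output above (fields trimmed for brevity)
sample_series = [
    {"vectorId": 111666240, "productId": 33100036, "coordinate": "1.17.0.0.0.0.0.0.0.0"},
    {"vectorId": 111666225, "productId": 33100036, "coordinate": "1.2.0.0.0.0.0.0.0.0"},
    {"vectorId": 111666247, "productId": 33100036, "coordinate": "1.24.0.0.0.0.0.0.0.0"},
    {"vectorId": 39055, "productId": 10100139, "coordinate": "1.17.0.0.0.0.0.0.0.0"},
    {"vectorId": 36610, "productId": 10100136, "coordinate": "1.1.0.0.0.0.0.0.0.0"},
    {"vectorId": 36624, "productId": 10100136, "coordinate": "1.15.0.0.0.0.0.0.0.0"},
]


def count_series_by_product(series_list):
    """Return a Counter mapping each productId to its number of changed series."""
    return Counter(item["productId"] for item in series_list)


counts = count_series_by_product(sample_series)
for product_id, n in counts.most_common():
    print(f"{product_id}: {n} changed series")
```

The same function can be applied to the full list returned by `get_changed_series_list()`.
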
# Bash script to Extract Functions

In [None]:
%%bash
# Process only the files that contain the string "POST BODY" (grep matches case-sensitively)
for file in mcp-statcan/docs/references/api_sections/*.txt; do
  if grep -q "POST BODY" "$file"; then
    # Extract the filename without extension
    filename=$(basename "$file" .txt)

    # Read the second line from the file
    second_line=$(sed -n '2p' "$file")

    # Find the line after GET URL: and remove whitespace
    get_url_line=$(awk '/GET URL:/ {getline; gsub(/^[[:space:]]+|[[:space:]]+$/, "", $0); print}' "$file")

    # Find the line after POST URL: and remove whitespace
    post_url_line=$(awk '/POST URL:/ {getline; gsub(/^[[:space:]]+|[[:space:]]+$/, "", $0); print}' "$file")

    # Print the function definition
    echo "def ${filename}():"
    echo "  \"\"\""
    echo "  $second_line"
    echo "  \"\"\""

    if [ -n "$get_url_line" ]; then
      echo "  get_url = \"$get_url_line\""
    else
      echo "  # GET URL not found in file"
    fi

    if [ -n "$post_url_line" ]; then
      echo "  post_url = \"$post_url_line\""
    else
      echo "  # POST URL not found in file"
    fi

    echo "  return f\"API logic not implemented yet for ${filename}.\""
    echo ""
  fi
done

def getBulkVectorDataByRange():
  """
  For users that require accessing data according to a certain date range, this method allows access by date range and vector.
  """
  # GET URL not found in file
  post_url = "https://www150.statcan.gc.ca/t1/wds/rest/getBulkVectorDataByRange"
  return f"API logic not implemented yet for getBulkVectorDataByRange."

def getChangedSeriesDataFromCubePidCoord():
  """
  POST URL:
  """
  # GET URL not found in file
  post_url = "https://www150.statcan.gc.ca/t1/wds/rest/getChangedSeriesDataFromCubePidCoord"
  return f"API logic not implemented yet for getChangedSeriesDataFromCubePidCoord."

def getChangedSeriesDataFromVector():
  """
  POST URL:
  """
  # GET URL not found in file
  post_url = "https://www150.statcan.gc.ca/t1/wds/rest/getChangedSeriesDataFromVector"
  return f"API logic not implemented yet for getChangedSeriesDataFromVector."

def getCubeMetadata_examples():
  """
  Supplemental Information
  """
  get_url = "https://www150.statcan.gc.c
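
The same extraction can be sketched in Python, which keeps the stub generator in the notebook's main language. The function names (`line_after`, `build_stub`) and the sample text are hypothetical. The sample only imitates the layout the bash script relies on: a description on the second line and each URL on the line after its `GET URL:` or `POST URL:` marker. The real files in `docs/references/api_sections/` may differ.

```python
def line_after(lines, marker):
    """Return the stripped line following the first line that contains marker, or ""."""
    for i, line in enumerate(lines[:-1]):
        if marker in line:
            return lines[i + 1].strip()
    return ""


def build_stub(name, text):
    """Build a Python function stub from the text of one API section file."""
    lines = text.splitlines()
    description = lines[1].strip() if len(lines) > 1 else ""
    get_url = line_after(lines, "GET URL:")
    post_url = line_after(lines, "POST URL:")

    body = [f"def {name}():", '  """', f"  {description}", '  """']
    body.append(f'  get_url = "{get_url}"' if get_url else "  # GET URL not found in file")
    body.append(f'  post_url = "{post_url}"' if post_url else "  # POST URL not found in file")
    body.append(f'  return "API logic not implemented yet for {name}."')
    return "\n".join(body)


# Hypothetical sample imitating the assumed file layout
sample = """getCubeMetadata
Sample description line (hypothetical).
POST URL:
    https://www150.statcan.gc.ca/t1/wds/rest/getCubeMetadata
POST BODY:
    [{"productId": 35100003}]
"""

stub = build_stub("getCubeMetadata", sample)
print(stub)
```

Unlike the `awk` commands, `line_after` stops at the first marker it finds, so a file with several `GET URL:` lines yields a single URL.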

In [None]:
%%markdown
def getAllCubesListLite():
  """
  Users can query the output database to provide a complete inventory of data tables available through this Statistics Canada API. This command accesses a list of details about each table.  Unlike getAllCubesList, this method does not return dimension or footnote information.
  """
  get_url = "https://www150.statcan.gc.ca/t1/wds/rest/getAllCubesListLite"
  # POST URL not found in file
  return f"API logic not implemented yet for getAllCubesListLite."

def getAllCubesList():
  """
  Users can query the output database to provide a complete inventory of data tables available through this Statistics Canada API. This command accesses a comprehensive list of details about each table including information at the dimension level.
  """
  get_url = "https://www150.statcan.gc.ca/t1/wds/rest/getAllCubesList"
  # POST URL not found in file
  return f"API logic not implemented yet for getAllCubesList."

def getBulkVectorDataByRange():
  """
  For users that require accessing data according to a certain date range, this method allows access by date range and vector.
  """
  # GET URL not found in file
  post_url = "https://www150.statcan.gc.ca/t1/wds/rest/getBulkVectorDataByRange"
  return f"API logic not implemented yet for getBulkVectorDataByRange."

def getChangedCubeList():
  """
  Users can also query what has changed at the table/cube level on a specific day by adding an ISO date to the end of the URL. This date can be any date from today into the past.
  """
  get_url = "https://www150.statcan.gc.ca/t1/wds/rest/getChangedCubeList/2017-12-07"
  # POST URL not found in file
  return f"API logic not implemented yet for getChangedCubeList."

def getChangedSeriesDataFromCubePidCoord():
  """
  POST URL:
  """
  # GET URL not found in file
  post_url = "https://www150.statcan.gc.ca/t1/wds/rest/getChangedSeriesDataFromCubePidCoord"
  return f"API logic not implemented yet for getChangedSeriesDataFromCubePidCoord."

def getChangedSeriesDataFromVector():
  """
  POST URL:
  """
  # GET URL not found in file
  post_url = "https://www150.statcan.gc.ca/t1/wds/rest/getChangedSeriesDataFromVector"
  return f"API logic not implemented yet for getChangedSeriesDataFromVector."

def getChangedSeriesList():
  """
  Users can choose to ask for what series have changed today. This can be invoked at any time of day and will reflect the list of series that have been updated at 8:30am EST on a given release up until midnight that same day.
  """
  get_url = "https://www150.statcan.gc.ca/t1/wds/rest/getChangedSeriesList"
  # POST URL not found in file
  return f"API logic not implemented yet for getChangedSeriesList."

def getCodeSets():
  """
  GET URL:
  """
  get_url = "https://www150.statcan.gc.ca/t1/wds/rest/getCodeSets"
  # POST URL not found in file
  return f"API logic not implemented yet for getCodeSets."

def getCubeMetadata():
  """
  POST URL:
  """
  # GET URL not found in file
  post_url = "https://www150.statcan.gc.ca/t1/wds/rest/getCubeMetadata"
  return f"API logic not implemented yet for getCubeMetadata."

def getDataFromCubePidCoordAndLatestNPeriods():
  """
  For those who are looking to display data going back N reporting periods from today there are the following set of endpoints (methods). Both methods will return the same results. Our example uses the last three (3) reference periods.
  """
  # GET URL not found in file
  post_url = "https://www150.statcan.gc.ca/t1/wds/rest/getDataFromCubePidCoordAndLatestNPeriods"
  return f"API logic not implemented yet for getDataFromCubePidCoordAndLatestNPeriods."

def getDataFromVectorByReferencePeriodRange():
  """
  For users that require accessing data according to a certain reference period range, this method allows access by reference period range and vector.
  """
  get_url = "https://www150.statcan.gc.ca/t1/wds/rest/getDataFromVectorByReferencePeriodRange"
  # POST URL not found in file
  return f"API logic not implemented yet for getDataFromVectorByReferencePeriodRange."

def getDataFromVectorsAndLatestNPeriods():
  """
  POST URL:
  """
  # GET URL not found in file
  post_url = "https://www150.statcan.gc.ca/t1/wds/rest/getDataFromVectorsAndLatestNPeriods"
  return f"API logic not implemented yet for getDataFromVectorsAndLatestNPeriods."

def getFullTableDownloadCSV():
  """
  For users who require the full table/cube of extracted time series, a static file download is available in both CSV and SDMX (XML) formats. Both return a link to the ProductId (PID) supplied in the URL. The CSV version also lets users select either the English (en) or French (fr) versions. The example uses English as the desired output language for CSV. In the case of accessing an SDMX full table download, it does not require a language selection due to being a bilingual format.
  """
  get_url = "https://www150.statcan.gc.ca/t1/wds/rest/getFullTableDownloadCSV/14100287/en"
  # POST URL not found in file
  return f"API logic not implemented yet for getFullTableDownloadCSV."

def getFullTableDownloadSDMX():
  """
  GET URL:
  """
  get_url = "https://www150.statcan.gc.ca/t1/wds/rest/getFullTableDownloadSDMX/14100287"
  # POST URL not found in file
  return f"API logic not implemented yet for getFullTableDownloadSDMX."

def getSeriesInfoFromCubePidCoord():
  """
  Users can also request series metadata either by CubePidCoord or Vector as seen earlier using getSeriesInfoFromVector
  """
  # GET URL not found in file
  post_url = "https://www150.statcan.gc.ca/t1/wds/rest/getSeriesInfoFromCubePidCoord"
  return f"API logic not implemented yet for getSeriesInfoFromCubePidCoord."

def getSeriesInfoFromVector():
  """
  POST URL:
  """
  # GET URL not found in file
  post_url = "https://www150.statcan.gc.ca/t1/wds/rest/getSeriesInfoFromVector"
  return f"API logic not implemented yet for getSeriesInfoFromVector."


API tests

In [19]:
import asyncio
import aiohttp
import json
import sys
import nest_asyncio

# Apply nest_asyncio to allow nested event loops
nest_asyncio.apply()

async def test_with_official_formats():
    """Test StatCan WDS API using formats from the user guide."""
    base_url = "https://www150.statcan.gc.ca/t1/wds/rest"

    # Test cases exactly as shown in the user guide
    test_cases = [
        {
            "name": "Get CPI data with proper format",
            "endpoint": "getDataFromVectorsAndLatestNPeriods",
            "payload": {"vectorId":1810000101, "latestN":3}
        },
        {
            "name": "Get CPI cube metadata",
            "endpoint": "getCubeMetadata",
            "payload": {"productId": "1810000101"}
        },
        {
            "name": "Get CPI cube metadata (numeric productId)",
            "endpoint": "getCubeMetadata",
            "payload": {"productId": 1810000101}
        },
        {
            "name": "Get series info for CPI",
            "endpoint": "getSeriesInfoFromVector",
            "payload": {"vectorId": "1810000101"}
        },
        {
            "name": "Get data from cube coordinate",
            "endpoint": "getDataFromCubePidCoord",
            "payload": {"productId": "1810000101", "coordinate": ["1.1.1", "1.1"], "latestN": 5}
        },
        {
            "name": "Get series info from cube coordinate",
            "endpoint": "getSeriesInfoFromCubePidCoord",
            "payload": {"productId": "1810000101", "coordinate": ["1.1.1", "1.1"]}
        }
    ]

    async with aiohttp.ClientSession() as session:
        for test in test_cases:
            print(f"\n===== Testing {test['name']} =====")
            url = f"{base_url}/{test['endpoint']}"
            print(f"URL: {url}")
            print(f"Payload: {json.dumps(test['payload'])}")

            try:
                async with session.post(url, json=test['payload']) as response:
                    print(f"Status: {response.status}")
                    try:
                        data = await response.json()
                        if isinstance(data, list) and len(data) > 0:
                            print(f"Response (array, first item): {json.dumps(data[0])[:500]}")
                        else:
                            print(f"Response (first 500 chars): {json.dumps(data)[:500]}")
                    except (aiohttp.ContentTypeError, ValueError):  # body is not JSON
                        text = await response.text()
                        print(f"Text response (first 200 chars): {text[:200]}")
            except Exception as e:
                print(f"Error: {e}")

if __name__ == "__main__":
    asyncio.run(test_with_official_formats())


===== Testing Get CPI data with proper format =====
URL: https://www150.statcan.gc.ca/t1/wds/rest/getDataFromVectorsAndLatestNPeriods
Payload: {"vectorId": 1810000101, "latestN": 3}
Status: 406
Response (first 500 chars): {"message": "JSON syntax error, please refer to the manual to check the input JSON content"}

===== Testing Get CPI cube metadata =====
URL: https://www150.statcan.gc.ca/t1/wds/rest/getCubeMetadata
Payload: {"productId": "1810000101"}
Status: 406
Response (first 500 chars): {"message": "JSON syntax error, please refer to the manual to check the input JSON content"}

===== Testing Get CPI cube metadata (numeric productId) =====
URL: https://www150.statcan.gc.ca/t1/wds/rest/getCubeMetadata
Payload: {"productId": 1810000101}
Status: 406
Response (first 500 chars): {"message": "JSON syntax error, please refer to the manual to check the input JSON content"}

===== Testing Get series info for CPI =====
URL: https://www150.statcan.gc.ca/t1/wds/rest/getSeriesInfoFromVector
P
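
Every request shown above posts a bare JSON object and is rejected with status 406. The curl commands later in this notebook post a JSON array instead, and `getCubeMetadata` succeeds with that shape. A minimal sketch of the wrapping step follows; `as_wds_payload` is a hypothetical helper name, and the product ID is the one used in the later curl example.

```python
import json


def as_wds_payload(payload):
    """Wrap a single request object in a list; return lists as a new list."""
    if isinstance(payload, dict):
        return [payload]
    return list(payload)


single = {"productId": 35100003}
print(json.dumps(as_wds_payload(single)))  # [{"productId": 35100003}]
```

With aiohttp, the wrapped value would be passed as `session.post(url, json=as_wds_payload(test["payload"]))`.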

In [3]:
%%bash

# Corrected Command 1: With Content-Type header and Array payload
echo "--- Testing getDataFromVectorsAndLatestNPeriods ---"
curl -X POST "https://www150.statcan.gc.ca/t1/wds/rest/getDataFromVectorsAndLatestNPeriods" \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-d '[{"vectorid": ["v41690973"], "latestN": 5}]' \
-v

echo -e "\n\n--- Testing getCubeMetadata ---"
# Corrected Command 2: With Content-Type header and Array payload (using numeric ProductID)
curl -X POST "https://www150.statcan.gc.ca/t1/wds/rest/getCubeMetadata" \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-d '[{"productId": 18100004}]' \
-v # Use a valid Product ID for CPI like 18100004

# Corrected Command 3: Using the specific example ProductID
echo -e "\n\n--- Testing getCubeMetadata with Example PID ---"
curl -X POST "https://www150.statcan.gc.ca/t1/wds/rest/getCubeMetadata" \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-d '[{"productId": 35100003}]' \
-v

--- Testing getDataFromVectorsAndLatestNPeriods ---
{"message":"JSON syntax error, please refer to the manual to check the input JSON content"}

--- Testing getCubeMetadata ---
[{"status":"SUCCESS","object":{"responseStatusCode":0,"productId":"18100004","cansimId":"326-0020","cubeTitleEn":"Consumer Price Index, monthly, not seasonally adjusted","cubeTitleFr":"Indice des prix à la consommation mensuel, non désaisonnalisé","cubeStartDate":"1914-01-01","cubeEndDate":"2025-02-01","frequencyCode":6,"nbSeriesCube":2139,"nbDatapointsCube":1116823,"releaseTime":"2025-03-18T08:30","archiveStatusCode":"2","archiveStatusEn":"CURRENT - a cube available to the public and that is current","archiveStatusFr":"ACTIF - un cube qui est disponible au public et qui est toujours mise a jour","subjectCode":["1802","451003"],"surveyCode":["2301"],"dimension":[{"dimensionPositionId":1,"dimensionNameEn":"Geography","dimensionNameFr":"Géographie","hasUom":false,"member":[{"memberId":2,"parentMemberId":null,"memb

Note: Unnecessary use of -X or --request, POST is already inferred.
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 167.44.105.20:443...
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0* Connected to www150.statcan.gc.ca (167.44.105.20) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
*  CAfile: /etc/ssl/certs/ca-certificates.crt
*  CApath: /etc/ssl/certs
* TLSv1.0 (OUT), TLS header, Certificate Status (22):
} [5 bytes data]
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* TLSv1.2 (IN), TLS header, Certificate Status (22):
{ [5 bytes data]
* TLSv1.3 (IN), TLS handshake, Server hello (2):
{ [91 bytes data]
* TLSv1.2 (IN), TLS header, Certificate Status (22):
{ [5 bytes data]
* TLSv1.2 (IN), TLS handshake, Certificate (11):
{ [4
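
The first command above still returns the JSON syntax error even though its body is an array. Its body differs from the one the Python helpers later in this notebook send for the same endpoint: the key is spelled `vectorid` instead of `vectorId`, and the value is a list holding the string `"v41690973"` instead of a single integer. A sketch of building the later shape from a `v`-prefixed ID follows. Both function names are hypothetical, and whether this particular vector returns data was not checked here.

```python
def normalize_vector_id(value):
    """Convert 'v41690973', 'V41690973', '41690973' or 41690973 to the int 41690973."""
    if isinstance(value, int):
        return value
    text = str(value).strip()
    if text[:1].lower() == "v":
        text = text[1:]  # drop the leading "v" used in table views
    return int(text)  # raises ValueError for non-numeric input


def latest_n_payload(vector_id, latest_n):
    """Build the array body in the shape used by the later Python helpers."""
    return [{"vectorId": normalize_vector_id(vector_id), "latestN": int(latest_n)}]


print(latest_n_payload("v41690973", 5))
```
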

In [None]:
!cd mcp-statcan/docs/references/api_sections/ && cat getAllCubesList.txt

In [4]:
import asyncio
import json
import logging
from typing import Any, Dict, List, Optional, Union
import httpx

# Initialize logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Constants
BASE_URL = "https://www150.statcan.gc.ca/t1/wds/rest"

# Reusable HTTP client
async def get_statcan_client() -> httpx.AsyncClient:
    """Returns a configured httpx client for StatCan API requests."""
    return httpx.AsyncClient(
        base_url=BASE_URL,
        timeout=30.0,
        headers={"Accept": "application/json", "User-Agent": "StatCan-Jupyter-Tests/1.0"}
    )

async def make_get_request(endpoint: str, params: Optional[Dict[str, Any]] = None) -> Any:
    """Make a GET request to the StatCan API and return the decoded JSON, or an error dict."""
    # `async with` closes the client on every path, including errors
    async with await get_statcan_client() as client:
        try:
            response = await client.get(endpoint, params=params)
            response.raise_for_status()
            return response.json()
        except httpx.HTTPError as e:
            logger.error(f"HTTP error during GET request to {endpoint}: {e}")
            return {"status": "ERROR", "message": str(e)}
        except Exception as e:
            logger.error(f"Unexpected error during GET request to {endpoint}: {e}")
            return {"status": "ERROR", "message": str(e)}

async def make_post_request(endpoint: str, data: Any) -> Any:
    """Make a POST request to the StatCan API and return the decoded JSON, or an error dict."""
    # `async with` closes the client on every path, including errors
    async with await get_statcan_client() as client:
        try:
            response = await client.post(endpoint, json=data)
            response.raise_for_status()
            return response.json()
        except httpx.HTTPError as e:
            logger.error(f"HTTP error during POST request to {endpoint}: {e}")
            return {"status": "ERROR", "message": str(e)}
        except Exception as e:
            logger.error(f"Unexpected error during POST request to {endpoint}: {e}")
            return {"status": "ERROR", "message": str(e)}

# API Methods Implementation - Modified for Jupyter

async def get_changed_series_list():
    """Get the list of series that have changed today."""
    result = await make_get_request("/getChangedSeriesList")
    return result

async def get_changed_cube_list(date: str):
    """Get the list of data tables/cubes that changed on a specific date."""
    result = await make_get_request(f"/getChangedCubeList/{date}")
    return result

async def get_cube_metadata(product_id: int):
    """Get metadata for a specific statistical data cube/table."""
    data = [{"productId": product_id}]
    result = await make_post_request("/getCubeMetadata", data)
    return result

async def get_series_info_from_vector(vector_id: int):
    """Get information about a specific statistical data series using its vector ID."""
    data = [{"vectorId": vector_id}]
    result = await make_post_request("/getSeriesInfoFromVector", data)
    return result

async def get_series_info_from_cube_pid_coord(product_id: int, coordinate: str):
    """Get information about a specific statistical data series using product ID and coordinate."""
    data = [{"productId": product_id, "coordinate": coordinate}]
    result = await make_post_request("/getSeriesInfoFromCubePidCoord", data)
    return result

async def get_all_cubes_list():
    """Get a comprehensive inventory of all data tables available through the Statistics Canada API."""
    result = await make_get_request("/getAllCubesList")
    return result

async def get_all_cubes_list_lite():
    """Get a lightweight inventory of all data tables available through the Statistics Canada API."""
    result = await make_get_request("/getAllCubesListLite")
    return result

async def get_changed_series_data_from_vector(vector_id: int):
    """Get data for a specific statistical series that has changed, using its vector ID."""
    data = [{"vectorId": vector_id}]
    result = await make_post_request("/getChangedSeriesDataFromVector", data)
    return result

async def get_changed_series_data_from_cube_pid_coord(product_id: int, coordinate: str):
    """Get data for a specific statistical series that has changed, using product ID and coordinate."""
    data = [{"productId": product_id, "coordinate": coordinate}]
    result = await make_post_request("/getChangedSeriesDataFromCubePidCoord", data)
    return result

async def get_data_from_vectors_and_latest_n_periods(vector_id: int, latest_n: int):
    """Get the latest N periods of data for a specific statistical series using its vector ID."""
    data = [{"vectorId": vector_id, "latestN": latest_n}]
    result = await make_post_request("/getDataFromVectorsAndLatestNPeriods", data)
    return result

async def get_data_from_cube_pid_coord_and_latest_n_periods(product_id: int, coordinate: str, latest_n: int):
    """Get the latest N periods of data for a specific statistical series using product ID and coordinate."""
    data = [{"productId": product_id, "coordinate": coordinate, "latestN": latest_n}]
    result = await make_post_request("/getDataFromCubePidCoordAndLatestNPeriods", data)
    return result

async def get_bulk_vector_data_by_range(vector_ids: List[str], start_date: str, end_date: str):
    """Get data for multiple statistical series within a date range based on release dates."""
    data = {
        "vectorIds": vector_ids,
        "startDataPointReleaseDate": start_date,
        "endDataPointReleaseDate": end_date,
    }
    result = await make_post_request("/getBulkVectorDataByRange", data)
    return result

async def get_data_from_vector_by_reference_period_range(vector_ids: List[str], start_ref_period: str, end_ref_period: str):
    """Get data for multiple statistical series within a reference period range."""
    vector_ids_param = ",".join([f'"{vid}"' for vid in vector_ids])
    params = {
        "vectorIds": vector_ids_param,
        "startRefPeriod": start_ref_period,
        "endReferencePeriod": end_ref_period,
    }
    result = await make_get_request("/getDataFromVectorByReferencePeriodRange", params)
    return result

async def get_full_table_download_csv(product_id: int, language: str = "en"):
    """Get a download link for a complete statistical table in CSV format."""
    if language not in ["en", "fr"]:
        return {"status": "ERROR", "message": "Language must be 'en' or 'fr'"}

    result = await make_get_request(f"/getFullTableDownloadCSV/{product_id}/{language}")
    return result

async def get_full_table_download_sdmx(product_id: int):
    """Get a download link for a complete statistical table in SDMX (XML) format."""
    result = await make_get_request(f"/getFullTableDownloadSDMX/{product_id}")
    return result

async def get_code_sets():
    """Get all code sets used by Statistics Canada for describing data."""
    result = await make_get_request("/getCodeSets")
    return result

async def search_tables_by_title(search_term: str):
    """Search for statistical tables by title."""
    # First get all tables
    all_tables = await make_get_request("/getAllCubesListLite")

    # Check if the response is an error
    if isinstance(all_tables, dict) and all_tables.get("status") == "ERROR":
        return all_tables

    # Ensure we're working with a list of tables
    tables_list = all_tables
    if isinstance(all_tables, dict) and "object" in all_tables:
        tables_list = all_tables.get("object", [])

    # Filter tables by search term
    search_term = search_term.lower()
    matching_tables = []

    for table in tables_list:
        title_en = table.get("cubeTitleEn", "").lower()
        title_fr = table.get("cubeTitleFr", "").lower()

        if search_term in title_en or search_term in title_fr:
            matching_tables.append(table)

    result = {"status": "SUCCESS", "object": matching_tables}
    return result

async def get_data_with_citation(vector_id: int, latest_n: int = 5):
    """Get data for a statistical series with proper citation information."""
    # Get series info
    series_info_data = [{"vectorId": vector_id}]
    series_info = await make_post_request("/getSeriesInfoFromVector", series_info_data)

    # Return early on an error dict, an empty list, or a non-SUCCESS entry
    if not isinstance(series_info, list) or not series_info or series_info[0].get("status") != "SUCCESS":
        return series_info

    # Get cube metadata
    product_id = series_info[0]["object"]["productId"]
    cube_metadata_data = [{"productId": product_id}]
    cube_metadata = await make_post_request("/getCubeMetadata", cube_metadata_data)

    # Return early on an error dict, an empty list, or a non-SUCCESS entry
    if not isinstance(cube_metadata, list) or not cube_metadata or cube_metadata[0].get("status") != "SUCCESS":
        return cube_metadata

    # Get actual data
    data_request = [{"vectorId": vector_id, "latestN": latest_n}]
    data = await make_post_request("/getDataFromVectorsAndLatestNPeriods", data_request)

    # Return early on an error dict, an empty list, or a non-SUCCESS entry
    if not isinstance(data, list) or not data or data[0].get("status") != "SUCCESS":
        return data

    # Create citation
    metadata = cube_metadata[0]["object"]
    series = series_info[0]["object"]

    citation = {
        "data": data[0],
        "citation": {
            "table_id": metadata.get("productId"),
            "cansim_id": metadata.get("cansimId", ""),
            "table_title_en": metadata.get("cubeTitleEn", ""),
            "table_title_fr": metadata.get("cubeTitleFr", ""),
            "series_title_en": series.get("SeriesTitleEn", ""),
            "series_title_fr": series.get("SeriesTitleFr", ""),
            "source": "Statistics Canada",
            "url": f"https://www150.statcan.gc.ca/t1/tbl/en/{metadata.get('productId')}-eng.htm",
            "release_time": data[0]["object"]["vectorDataPoint"][0].get("releaseTime", ""),
            "citation_text": (
                f"Statistics Canada. Table {metadata.get('cansimId', '')}: "
                f"{metadata.get('cubeTitleEn', '')}. "
                f"Retrieved on {data[0]['object']['vectorDataPoint'][0].get('releaseTime', '').split('T')[0]} "
                f"from https://www150.statcan.gc.ca/t1/tbl/en/{metadata.get('productId')}-eng.htm"
            )
        }
    }

    return citation

# Function to run and display the results of a test function
async def run_test(test_function, *args, **kwargs):
    """Run a test function and display its results."""
    print(f"Testing {test_function.__name__}...")
    try:
        result = await test_function(*args, **kwargs)
        print(json.dumps(result, indent=2))
        print("-" * 80)
        return result
    except Exception as e:
        print(f"Error: {str(e)}")
        print("-" * 80)
        return None

# Example test cases you can use in your Jupyter notebook
# Copy the following code to cells as needed

# Sample Test 1: Get changed series list
# await run_test(get_changed_series_list)

# Sample Test 2: Get changed cube list for a specific date
# await run_test(get_changed_cube_list, "2023-03-15")

# Sample Test 3: Get metadata for a specific cube
# await run_test(get_cube_metadata, 35100003)

# Sample Test 4: Get series info for a specific vector
# await run_test(get_series_info_from_vector, 32164132)

# Sample Test 5: Get the latest 5 data points for a vector
# await run_test(get_data_from_vectors_and_latest_n_periods, 32164132, 5)

# Sample Test 6: Search for tables with "employment" in the title
# await run_test(search_tables_by_title, "employment")

# Sample Test 7: Get data with citation
# await run_test(get_data_with_citation, 32164132)
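
Responses in this notebook arrive in more than one shape. The first cell received a single dict with `status` and `object` keys from `getChangedSeriesList`, the curl call to `getCubeMetadata` received a list of such dicts, and the request helpers above return `{"status": "ERROR", ...}` on failure. A small normalizing helper can spare callers from checking each case by hand. This is a sketch: `successful_objects` is a hypothetical name, and the sample values below are made up for illustration.

```python
def successful_objects(response):
    """Return the "object" payloads of all SUCCESS entries in a response.

    Accepts either a single envelope (dict) or a list of envelopes.
    Anything else, or an envelope without SUCCESS status, contributes nothing.
    """
    if isinstance(response, dict):
        envelopes = [response]
    elif isinstance(response, list):
        envelopes = response
    else:
        return []
    return [
        env["object"]
        for env in envelopes
        if isinstance(env, dict) and env.get("status") == "SUCCESS" and "object" in env
    ]


# Illustrative inputs covering the three shapes described above
print(successful_objects({"status": "ERROR", "message": "timeout"}))
print(successful_objects([{"status": "SUCCESS", "object": {"productId": "18100004"}}]))
print(successful_objects({"status": "SUCCESS", "object": [{"vectorId": 39055}]}))
```
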
