In [3]:
import textwrap
from lyon_common import create_chain, report_on_message
from question_generator_model import SingleSelection, Code, AnyQuestion
from langchain.output_parsers import PydanticOutputParser

**NOTE:** I have updated `lyon_common.py` so that the `create_chain` function accepts a few more arguments:

- `model_name`: now defaults to the GPT-3.5 Turbo model released on 2023-11-06.
- `model_kwargs`: a dictionary of OpenAI-specific parameters. By default it now instructs OpenAI to use the [new `json_object` response format](https://platform.openai.com/docs/guides/text-generation/json-mode), which constrains generation so that at each step only tokens that keep the output valid JSON can be produced.
- `verbose`: a new keyword argument that defaults to `False`.

This notebook shows how we can use these updates to generate valid, Pydantic-ready JSON.
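
As a rough illustration of the `model_kwargs` default described above, the sketch below builds a dictionary that requests JSON mode and merges caller overrides on top of it. The helper names `default_model_kwargs` and `merge_model_kwargs`, the merging behaviour, and the field names in the sample reply are all hypothetical; they are not taken from `lyon_common.py`. Only the `response_format` value follows the OpenAI documentation linked above.

```python
import json


def default_model_kwargs():
    """Hypothetical helper: kwargs that ask OpenAI for JSON mode."""
    return {"response_format": {"type": "json_object"}}


def merge_model_kwargs(overrides=None):
    """Assumed design: caller-supplied values win over the defaults."""
    merged = default_model_kwargs()
    merged.update(overrides or {})
    return merged


# `seed` is used here only as an example of an extra OpenAI parameter.
kwargs = merge_model_kwargs({"seed": 42})

# Hand-written stand-in for a model reply (field names are made up).
reply = '{"question_text": "What does rolling() return?", "difficulty": 2}'

# In JSON mode the reply should decode without any cleanup step.
parsed = json.loads(reply)
```

Valid JSON is only the first requirement: the decoded object still has to match the fields of the target Pydantic model, which is what the format instructions in the system prompt are for.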

In [4]:
def create_system_prompt(pydantic_object):
    common_system_prompt = textwrap.dedent("""
    You are a smart, helpful teaching assistant chatbot named Callisto.

    You are an expert Python programmer and have used all the most popular
    libraries for data analysis, machine learning, and artificial intelligence.

    You assist professors who teach courses about Python, data science, and machine learning
    to graduate students.

    Your task is to help professors produce practice questions to help students solidify
    their understanding of specific topics.

    In your conversations with a professor you will be given a topic (string) and an
    expected difficulty level (integer).
    
    The difficulty will be a number between 1 and 3, with 1 corresponding to a request 
    for an easy question, and 3 for the most difficult question.
    
    If the professor asks you for another question and does not specify either a new topic 
    or a new difficulty, you must use the previous topic or difficulty.

    You are encouraged to use any tools available to look up relevant information, but
    only if necessary.

    Your responses must always exactly match the specified JSON format with no extra words or content.

    You must always produce exactly one JSON object.
    
    {format_instructions}
    """)

    parser = PydanticOutputParser(pydantic_object=pydantic_object)
    return common_system_prompt.format(format_instructions=parser.get_format_instructions())
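
The function above combines `textwrap.dedent` with `str.format` to splice the parser's format instructions into the prompt. The sketch below shows the same pattern with the standard library only. The shortened template and the `fake_instructions` string are stand-ins invented for illustration; the real instructions come from `PydanticOutputParser.get_format_instructions()`.

```python
import textwrap

# Shortened stand-in for the system prompt template.
template = textwrap.dedent("""
    You must always produce exactly one JSON object.

    {format_instructions}
""")

# Made-up stand-in for the text LangChain would generate from a Pydantic model.
fake_instructions = (
    'Return JSON matching this schema: '
    '{"properties": {"answer": {"type": "string"}}}'
)

# Only the named placeholder in the template is substituted.
prompt = template.format(format_instructions=fake_instructions)

# Braces inside the substituted value are not re-interpreted by str.format,
# so the schema text arrives in the prompt unchanged.
print(prompt)
```

Literal braces written directly in the template would need to be doubled (`{{` and `}}`), which is one reason to keep the JSON schema in the substituted value rather than in the template itself.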
    

In [5]:
def generate_and_parse_question(pydantic_model, query):
    """Ask the chain for one question and parse the JSON reply into `pydantic_model`."""
    parser = PydanticOutputParser(pydantic_object=pydantic_model)
    rag_chain = create_chain(
        create_system_prompt(pydantic_model),
        temperature=0.1,
        verbose=True,
        model_name="gpt-4-1106-preview",
    )
    response = rag_chain(query)
    report_on_message(response)  # print a summary of what was produced
    return parser.parse(response["output"])
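
To make the final parsing step more concrete, here is a simplified sketch of what happens to `response["output"]`: the text is decoded as JSON and then checked against the expected fields. `PydanticOutputParser` does this with Pydantic validation; the dataclass below is a swapped-in standard-library simplification, and `ToyQuestion`, its fields, and `parse_output` are hypothetical names, not the real `Code` model.

```python
import json
from dataclasses import dataclass


@dataclass
class ToyQuestion:
    """Hypothetical stand-in for a Pydantic model such as `Code`."""
    question_text: str
    difficulty: int


def parse_output(raw: str) -> ToyQuestion:
    """Decode the JSON text, then check it against the expected fields."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    question = ToyQuestion(**data)  # raises TypeError on missing or extra keys
    # The system prompt restricts difficulty to the range 1..3.
    if not 1 <= question.difficulty <= 3:
        raise ValueError("difficulty must be between 1 and 3")
    return question


# Hand-written stand-in for the dictionary returned by the chain.
response = {
    "output": '{"question_text": "Compute a 30-day rolling mean.", "difficulty": 2}'
}
question = parse_output(response["output"])
```

JSON mode helps with the first step (decoding), while the format instructions in the system prompt are what steer the model toward the right field names for the second step.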

In [6]:
generate_and_parse_question(Code, "topic: pandas time series rolling moments\ndifficulty: 2")



> Entering new AgentExecutor chain...

Invoking: `search_course_content` with `pandas time series rolling moments`


[Document(page_content='To do this operation, pandas starts at each row (date) then looks *backwards*\nthe specified number of periods (here 2 days) and then applies some aggregation\nfunction (`mean`) on all the data in that window.\n\nIf pandas cannot look back the full length of the window (e.g. when working on\nthe first row), it fills as much of the window as possible and then does the\noperation. Notice that the value at 2014-05-01 is the same in both DataFrames.\n\nBelow, we see a visual depiction of the rolling maximum on a 21 day window for\nthe whole dataset.\n\n```python\nfig, ax = plt.subplots(figsize=(14, 8))\nbtc_usd["Open"].plot(ax=ax, linestyle="--", alpha=0.8)\nbtc_usd.rolling("21d").max()["Open"].plot(ax=ax, alpha=0.8, linewidth=3)\nax.legend(["Original", "21 day max"]);\n```\n\nWe can also ask pandas to `apply` c

Given a time series of daily closing stock prices in a DataFrame `stock_prices`, with a `DateTimeIndex` and a column named `Close`, write a function `calculate_rolling_moments` that calculates the rolling mean and rolling standard deviation over a 30-day window and adds these as new columns to the DataFrame. The new columns should be named `Rolling_Mean` and `Rolling_Std` respectively.

Your function should handle cases where there are fewer than 30 days of data available by computing the mean and standard deviation over the available data in the window.

```python
def calculate_rolling_moments(stock_prices):
    # Your code here
    pass
```

**Solution**

```python
def calculate_rolling_moments(stock_prices):
    stock_prices['Rolling_Mean'] = stock_prices['Close'].rolling('30D').mean()
    stock_prices['Rolling_Std'] = stock_prices['Close'].rolling('30D').std()
```

**Test Suite**

```python
import pandas as pd
from datetime import datetime

# Sample data
stock_prices = pd.DataFrame({
    'Close': [100, 101, 102, 103, 104, 105, 106, 107, 108, 109],
}, index=pd.date_range(start='2022-01-01', periods=10, freq='D'))

def calculate_rolling_moments(stock_prices):
    stock_prices['Rolling_Mean'] = stock_prices['Close'].rolling('30D').mean()
    stock_prices['Rolling_Std'] = stock_prices['Close'].rolling('30D').std()

calculate_rolling_moments(stock_prices)

assert 'Rolling_Mean' in stock_prices.columns, 'Rolling_Mean column missing'
assert 'Rolling_Std' in stock_prices.columns, 'Rolling_Std column missing'
assert stock_prices.loc[stock_prices.index[0], 'Rolling_Mean'] == stock_prices.loc[stock_prices.index[0], 'Close'], 'Incorrect rolling mean calculation for first element'
assert pd.isna(stock_prices.loc[stock_prices.index[0], 'Rolling_Std']), 'Rolling standard deviation should be NaN for the first element'
assert stock_prices.loc[stock_prices.index[-1], 'Rolling_Mean'] == stock_prices['Close'].iloc[-30:].mean(), 'Incorrect rolling mean calculation at the end of the series'
assert stock_prices.loc[stock_prices.index[-1], 'Rolling_Std'] == stock_prices['Close'].iloc[-30:].std(), 'Incorrect rolling standard deviation calculation at the end of the series'
```