# MS5114 Assignment 1

In this assignment, you are required to analyze data from the Yahoo Finance website using three Python libraries (i.e. `yfinance`, `tensorflow` and `transformers`). Make sure that the required libraries are installed in your Python distribution (use Anaconda Navigator or the `pip` command for this purpose). You are expected to have knowledge of the following topics to solve the problems.

* data types, operators, conditions, functions
* lists, dictionaries, tuples, data frames
* strings

For details of the relevant libraries, visit the following:

* https://aroussi.com/post/python-yahoo-finance
* https://huggingface.co/blog/sentiment-analysis-python

### Name: _replace this text with student name_
### Student Id: _replace this text with student id_

In [None]:
# import required libraries
import yfinance as yf
from transformers import pipeline



## Problem 1

Write a function `calc_returns(prices)`. This function will process a list of stock prices and calculate the periodic returns. The function should assume that the oldest price is in `prices[0]` and the latest price is in `prices[-1]`. The function should use a loop to accumulate a list of returns for periods 1 to n. The periodic rate of return is calculated as the rate of change in price from the previous period, i.e.,

$r_i = \frac{p_i}{p_{i - 1}} - 1$

For example:

```
>>> prices = [100,110,105,112,115]
>>> returns = calc_returns(prices)
>>> print(returns)
[0.10000000000000009, -0.045454545454545414, 0.06666666666666665, 0.02678571428571419]
```

_Notes_:

* For $n$ stock prices, you will generate a list of $n-1$ periodic returns. There is no return for period $0$.
* The function `calc_returns` should not print any output, but rather create and return a list of periodic rates of return.
* When computing with binary floating point numbers, there is a small representational error which might result in an unexpected value in the insignificant digits (e.g., `110 / 100 - 1` gives a result of 0.10000000000000009). Do not be alarmed by this small error!
* The values in the list of returns will be unformatted floating-point numbers; you can use the `round()` function to round them to 2 decimal places.
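
The last two notes can be illustrated with a short sketch. It reproduces the representational error for the first return in the example (100 to 110) and then applies `round()`; the variable names are illustrative only.

```python
# Periodic return for the first two example prices (100 -> 110).
raw_return = 110 / 100 - 1

# 1.1 has no exact binary representation, so the subtraction
# leaves a tiny error in the last digits.
print(raw_return)        # 0.10000000000000009

# round() gives the value rounded to the requested number of decimals.
display_return = round(raw_return, 2)
print(display_return)    # 0.1
```

Rounding only when displaying results, rather than inside the calculation, keeps the full precision available for any later computation.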

In [None]:
# calc_returns: periodic returns from a list of prices
def calc_returns(prices):
    """
    Calculate the periodic returns based on a list of stock prices.

    Args:
    - prices (list of float): List of stock prices

    Returns:
    - returns (list of float): List of periodic returns
    """
    returns = []
    for i in range(1, len(prices)):
        # Calculate periodic rate of return
        ret = (prices[i] / prices[i - 1]) - 1
        returns.append(round(ret, 2))
    return returns

Analyze stock prices and returns for a specific period, e.g. 1 week or 2 weeks.

* Using the `yfinance` library, load a company's share price data using its stock ticker, e.g. "MSFT" for Microsoft.
* Extract the list of closing prices for each day.
* Use the `calc_returns()` function to calculate the list of returns.
* Print both stock prices and returns.
* Explain the trend in stock prices and returns.

_Note: Avoid using tickers for popular companies so that there is no overlap of tickers between students. Look on the Yahoo Finance website to find a different company and its ticker._

In [None]:
# code for above analysis goes here
# Load data for a specific company, e.g., "TCS.NS" for Tata Consultancy Services
ticker = "TCS.NS"
stock_data = yf.download(ticker, start="2024-01-01", end="2024-01-29")

# Extract closing prices
closing_prices = stock_data['Close']

# Calculate returns (calc_returns expects a plain list, so convert the Series first)
returns = calc_returns(closing_prices.tolist())

print("Stock Prices:")
print(closing_prices)
print("\nReturns:")
print(returns)

# Explanation of trend in stock prices and returns:
# The stock prices show the daily closing prices for the given period.
# The returns represent the percentage change in price from the previous day.
# Positive returns indicate price increases, while negative returns indicate price decreases.
# By analyzing the returns, we can identify the trend in the stock's performance over the period.


Stock Prices:
Date
2024-01-01    3811.100098
2024-01-02    3783.199951
2024-01-03    3691.750000
2024-01-04    3666.800049
2024-01-05    3737.899902
2024-01-08    3678.300049
2024-01-09    3689.899902
2024-01-10    3713.050049
2024-01-11    3735.550049
2024-01-12    3882.800049
2024-01-15    3903.800049
2024-01-16    3861.300049
2024-01-17    3884.600098
2024-01-18    3902.600098
2024-01-19    3943.050049
2024-01-22    3943.050049
2024-01-23    3858.250000
2024-01-24    3841.800049
2024-01-25    3810.300049
Name: Close, dtype: float64

Returns:
[-0.01, -0.02, -0.01, 0.02, -0.02, 0.0, 0.01, 0.01, 0.04, 0.01, -0.01, 0.01, 0.0, 0.01, 0.0, -0.02, -0.0, -0.01]
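
Rounding to two decimals hides most of the detail in these returns (one small negative return even appears as `-0.0`). As a rough aid for the trend explanation, the sketch below recomputes unrounded returns from the closing prices printed above and summarizes them. The prices are copied from the output; the variable names are illustrative assumptions, not part of the assignment.

```python
# Closing prices copied from the output above (TCS.NS, January 2024).
closing = [
    3811.100098, 3783.199951, 3691.750000, 3666.800049, 3737.899902,
    3678.300049, 3689.899902, 3713.050049, 3735.550049, 3882.800049,
    3903.800049, 3861.300049, 3884.600098, 3902.600098, 3943.050049,
    3943.050049, 3858.250000, 3841.800049, 3810.300049,
]

# Unrounded periodic returns: r_i = p_i / p_(i-1) - 1.
daily = [closing[i] / closing[i - 1] - 1 for i in range(1, len(closing))]

# Count days on which the close rose, fell, or stayed the same.
up_days = sum(1 for r in daily if r > 0)
down_days = sum(1 for r in daily if r < 0)
flat_days = len(daily) - up_days - down_days

# Return over the whole period, from the first close to the last.
period_return = closing[-1] / closing[0] - 1

print(f"up: {up_days}, down: {down_days}, flat: {flat_days}")
print(f"period return: {period_return:.4%}")
print(f"largest gain: {max(daily):.2%}, largest loss: {min(daily):.2%}")
```

Counts like these, together with the overall period return, give a more concrete basis for describing the trend than the sign of individual rounded returns.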





## Problem 2
Write a function `calc_simple_moving_average(prices, window_size)`. This function will process a list of stock prices and calculate the simple moving average based on a specified window size. The function should assume that the oldest price is in `prices[0]` and the latest price is in `prices[-1]`. The function should use a loop to calculate a list of moving averages, one for each full window. The simple moving average is calculated within a window of size $k$ that shifts through the list of prices, i.e.,

$a_i = \frac{1}{k} \sum_{j=i}^{i+k-1} p_j$

For example:

```
>>> prices = [100,110,105,112,115]
>>> averages = calc_simple_moving_average(prices, 3)
>>> print(averages)
[105.0, 109.0, 110.66666666666667]
```

_Notes_:

* For $n$ stock prices, you will generate a list of $n-k+1$ averages. There is no average for the first $k-1$ prices.
* The function `calc_simple_moving_average` should not print any output, but rather create and return a list of averages.
* The values in the list of averages will be unformatted floating-point numbers; you can use the `round()` function to round them to 2 decimal places.
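
To make the window arithmetic concrete, here is a minimal sketch that lists each window next to its average for the example prices. The helper `list_windows` is a hypothetical name introduced only for this illustration.

```python
def list_windows(prices, k):
    """Return (window, average) pairs for every full window of size k."""
    pairs = []
    for i in range(len(prices) - k + 1):
        window = prices[i:i + k]  # prices p_i ... p_(i+k-1)
        pairs.append((window, sum(window) / k))
    return pairs


prices = [100, 110, 105, 112, 115]
for window, average in list_windows(prices, 3):
    print(window, "->", average)
```

With 5 prices and a window of 3, the loop produces 5 - 3 + 1 = 3 windows, starting at indices 0, 1 and 2.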


In [None]:
# calc_simple_moving_average: moving averages over a sliding window
def calc_simple_moving_average(prices, window_size):
    """
    Calculate the simple moving average based on a specified window size.

    Args:
    - prices (list of float): List of stock prices
    - window_size (int): Size of the moving window

    Returns:
    - averages (list of float): List of moving averages
    """
    averages = []
    for i in range(len(prices) - window_size + 1):
        # Calculate moving average for each window
        average = sum(prices[i:i+window_size]) / window_size
        averages.append(round(average, 2))
    return averages

Analyze stock prices and averages for a specific period, e.g. 1 month or 2 months.

* Using the `yfinance` library, load a company's share price data using its stock ticker, e.g. "MSFT" for Microsoft.
* Extract the list of closing prices for each day.
* Use the `calc_simple_moving_average()` function to calculate the simple moving average.
* Print both stock prices and averages.
* Explain the trend in stock prices and averages.

_Note: You can use the same ticker here as for the previous problem_

In [None]:
# code for above analysis goes here
# Load data for the same company as in Problem 1
# For consistency, reuse the ticker ("TCS.NS") and closing prices already loaded in Problem 1

# Extract closing prices (already done in Problem 1)
# Calculate simple moving averages
window_size = 5  # Example window size
averages = calc_simple_moving_average(closing_prices, window_size)

print("\nStock Prices:")
print(closing_prices)
print("\nSimple Moving Averages (Window Size={}):".format(window_size))
print(averages)

# Explanation of trend in stock prices and averages:
# The simple moving averages provide a smoothed representation of the stock prices over the specified window size.
# By comparing the moving averages with the actual prices, we can identify trends and potential buy/sell signals.
# When the price is above the moving average, it may indicate an upward trend, and vice versa.


Stock Prices:
Date
2024-01-01    3811.100098
2024-01-02    3783.199951
2024-01-03    3691.750000
2024-01-04    3666.800049
2024-01-05    3737.899902
2024-01-08    3678.300049
2024-01-09    3689.899902
2024-01-10    3713.050049
2024-01-11    3735.550049
2024-01-12    3882.800049
2024-01-15    3903.800049
2024-01-16    3861.300049
2024-01-17    3884.600098
2024-01-18    3902.600098
2024-01-19    3943.050049
2024-01-22    3943.050049
2024-01-23    3858.250000
2024-01-24    3841.800049
2024-01-25    3810.300049
Name: Close, dtype: float64

Simple Moving Averages (Window Size=5):
[3738.15, 3711.59, 3692.93, 3697.19, 3710.94, 3739.92, 3785.02, 3819.3, 3853.61, 3887.02, 3899.07, 3906.92, 3906.31, 3897.75, 3879.29]
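
The comparison between price and moving average described in the comments can be sketched as follows. Each average is paired with the last price in its window; the prices are copied from the output above, and the variable names are illustrative assumptions.

```python
# Closing prices copied from the output above (TCS.NS, January 2024).
closing = [
    3811.100098, 3783.199951, 3691.750000, 3666.800049, 3737.899902,
    3678.300049, 3689.899902, 3713.050049, 3735.550049, 3882.800049,
    3903.800049, 3861.300049, 3884.600098, 3902.600098, 3943.050049,
    3943.050049, 3858.250000, 3841.800049, 3810.300049,
]
k = 5  # window size used in the analysis above

# Simple moving average over every full window of k prices.
sma = [sum(closing[i:i + k]) / k for i in range(len(closing) - k + 1)]

# sma[i] covers closing[i : i + k], so it is paired with the last
# price in that window, closing[i + k - 1].
above = [closing[i + k - 1] > sma[i] for i in range(len(sma))]

print(f"{sum(above)} of {len(above)} closes were above their {k}-day average")
```

A run of closes above the average followed by a run below it is one simple way to describe where the trend turned during the period.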


## Problem 3

Find out the sentiment of recent news about a company.

* Using the `yfinance` library, load the news about a company using its stock ticker, e.g. "MSFT" for Microsoft.
* Extract the list of news titles.
* Use a sentiment-analysis `pipeline()` from the `transformers` library to calculate the sentiment for each title.
* Print both news and sentiment.
* Explain the sentiment in news.

_Note: You can use the same ticker here as for the previous problem_

In [None]:
# code for above analysis goes here
def sentiment_analysis(ticker):
    """
    Perform sentiment analysis on recent news about a company using its stock ticker.

    Args:
    - ticker (str): Stock ticker symbol of the company

    Returns:
    - news_sentiments (list of dict): List of dictionaries containing news titles and corresponding sentiment scores
    """
    # Load news data (a list of dictionaries, one per news item)
    company = yf.Ticker(ticker)
    news_items = company.news

    # Extract the title of every news item
    news_titles = [item['title'] for item in news_items]

    # Initialize sentiment pipeline
    sentiment_pipeline = pipeline("sentiment-analysis")

    # Analyze sentiment for each news title
    news_sentiments = []
    for title in news_titles:
        # Perform sentiment analysis; the pipeline returns a list with one result per input
        result = sentiment_pipeline(title)
        # Extract the predicted label and its score (the model's confidence in that label)
        sentiment_label = result[0]['label']
        sentiment_score = result[0]['score']
        # Store title, label and score in a dictionary
        news_sentiments.append({'title': title, 'sentiment_label': sentiment_label, 'sentiment_score': sentiment_score})

    return news_sentiments

In [None]:
news_sentiments = sentiment_analysis(ticker)

print("\nRecent News and Sentiments:")
if isinstance(news_sentiments, str):
    print(news_sentiments)
else:
    for news in news_sentiments:
        print(news['title'], "-", news['sentiment_score'])

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.

Recent News and Sentiments:
SKRR Exploration Inc. Announces Definitive Agreement with X1 Entertainment Group Inc. for the Manson Bay Project, Saskatchewan - 0.8283730149269104
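
The number printed next to the title is only the model's confidence in its predicted label, so on its own it does not say whether the sentiment is positive or negative. A sentiment-analysis pipeline returns a `label` together with the `score`, and the default model named in the log above uses the labels `POSITIVE` and `NEGATIVE`. The sketch below combines the two into a single signed value. The helper `signed_score` is a hypothetical name, and the example results are hand-written stand-ins rather than real model output.

```python
def signed_score(result):
    """Map one pipeline result to a value in [-1, 1].

    `result` is assumed to look like {'label': 'POSITIVE', 'score': 0.83},
    the shape a sentiment-analysis pipeline returns for each input.
    """
    sign = 1 if result["label"].upper() == "POSITIVE" else -1
    return sign * result["score"]


# Hand-written results standing in for real pipeline output.
examples = [
    {"label": "POSITIVE", "score": 0.83},
    {"label": "NEGATIVE", "score": 0.97},
]
for r in examples:
    print(r["label"], round(signed_score(r), 2))
```

Printing the label alongside the score, or a signed value like this one, makes the sentiment of each headline unambiguous.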
