---
title: "Application of R for Finance - Assignment 1"
subtitle: "Data Analysis"
author: "Group 30"
date: "2025-09-27"
format: pdf
---
\newpage
\tableofcontents
\listoffigures
\listoftables
\newpage

# Setup

## Required libraries
Load the following libraries for data manipulation (`dplyr`) and date handling (`lubridate`).


In [1]:
library(dplyr)
library(lubridate)


Attaching package: ‘dplyr’


The following objects are masked from ‘package:stats’:

    filter, lag


The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union



Attaching package: ‘lubridate’


The following objects are masked from ‘package:base’:

    date, intersect, setdiff, union




## Data Frame

Load the dataset **_compustat_food_bev.csv_** into a data frame.


This dataset contains company identifiers, trading information, and classification codes.

The table below summarises the main variables.

| Symbol   | Full Name                          | Description                                         |
|----------|------------------------------------|-----------------------------------------------------|
| GVKEY    | Global Company Key                 | Unique number assigned to each company in Compustat |
| iid      | Issue Identifier                   | Code for specific security of a company             |
| datadate | Data Date                          | Date of the trading record                          |
| tic      | Ticker Symbol                      | Stock ticker symbol of the company                  |
| conm     | Company Name                       | Official registered name of the company             |
| cshtrd   | Shares Traded                      | Number of shares traded during the day              |
| prccd    | Closing Price                      | Daily closing price of the security                 |
| prchd    | High Price                         | Daily highest trading price of the security         |
| prcld    | Low Price                          | Daily lowest trading price of the security          |
| prcod    | Opening Price                      | Daily opening price of the security                 |
| exchg    | Exchange Code                      | Code for stock exchange listing                     |
| sic      | Standard Industrial Classification | Code for primary business industry                  |

In [2]:
# load the data
data <- read.csv("compustat_food_bev.csv")

In [3]:
# inspect the structure
head(data)

|   | GVKEY | iid | datadate | tic | conm | cshtrd | prccd | prchd | prcld | prcod | exchg | sic |
|---|-------|-----|----------|-----|------|--------|-------|-------|-------|-------|-------|-----|
|   | `<int>` | `<int>` | `<chr>` | `<chr>` | `<chr>` | `<int>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<int>` | `<int>` |
| 1 | 186785 | 1 | 01/09/2020 | ARCO | ARCOS DORADOS HOLDINGS INC | 813895 | 4.47 | 4.54 | 4.394 | 4.41 | 11 | 5812 |
| 2 | 186785 | 1 | 02/09/2020 | ARCO | ARCOS DORADOS HOLDINGS INC | 518021 | 4.48 | 4.522 | 4.4 | 4.49 | 11 | 5812 |
| 3 | 186785 | 1 | 03/09/2020 | ARCO | ARCOS DORADOS HOLDINGS INC | 947825 | 4.44 | 4.525 | 4.36 | 4.48 | 11 | 5812 |
| 4 | 186785 | 1 | 04/09/2020 | ARCO | ARCOS DORADOS HOLDINGS INC | 534286 | 4.41 | 4.49 | 4.28 | 4.48 | 11 | 5812 |
| 5 | 186785 | 1 | 08/09/2020 | ARCO | ARCOS DORADOS HOLDINGS INC | 669380 | 4.27 | 4.415 | 4.26 | 4.34 | 11 | 5812 |
| 6 | 186785 | 1 | 09/09/2020 | ARCO | ARCOS DORADOS HOLDINGS INC | 1152416 | 4.38 | 4.42 | 4.27 | 4.27 | 11 | 5812 |


# Part 1 - Features

## Indicators Definition

Key indicators used across companies are defined below.

* **Daily Return:**
The relative change in the closing price from the previous trading day, reflecting daily profitability and price variation.
$$
return_{daily} = \frac{(close_t - close_{t-1})}{close_{t-1}}
$$

* **Overnight Return:**
The percentage change from the previous day’s close to the current day’s open, measuring price adjustments that occur outside of trading hours.
$$
return_{overnight} = \frac{(open_t - close_{t-1})}{close_{t-1}}
$$

* **10-Day Momentum Indicator:**
The difference between today's closing price and the closing price 10 trading days earlier, tracking short-term trend strength.
$$
momentum_{10\text{-}day} = close_t - close_{t-10}
$$

* **Daily Range:**
The difference between the daily high and low prices, representing intraday volatility.
$$
range_{daily} = high_t - low_t
$$

* **Volume Change:**
The difference in trading volume compared to the previous day, showing shifts in trading activity.
$$
change_{volume} = volume_t - volume_{t-1}
$$

* **Close-Open Change:**
The difference between the daily closing price and the opening price, indicating intraday price direction.
$$
change_{close-open} = close_t - open_t
$$

* **Money Flow Volume Indicator (MFV):**
The flow of money into and out of the security, estimating buying or selling pressure.
$$
MFV = \frac{((close_t - low_t) - (high_t - close_t))}{(high_t - low_t)} \times volume_t
$$
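
As a rough illustration of the definitions above, the sketch below computes all seven indicators on a small made-up data frame. The object `toy` and its values are hypothetical and chosen only so the results are easy to check by hand; the column names follow the dataset's conventions, and the rows are assumed to be sorted by date.

```r
library(dplyr)

# Hypothetical toy data: 12 trading days with simple integer prices
toy <- data.frame(
  prccd  = 100 + 0:11,          # close
  cshtrd = 1000 + 100 * (0:11)  # volume
)
toy$prcod <- toy$prccd - 1      # open
toy$prchd <- toy$prccd + 1      # high
toy$prcld <- toy$prccd - 3      # low

toy <- toy %>%
  mutate(
    return_daily      = (prccd - lag(prccd)) / lag(prccd),
    return_overnight  = (prcod - lag(prccd)) / lag(prccd),
    momentum_10_day   = prccd - lag(prccd, n = 10),
    range_daily       = prchd - prcld,
    volume_change     = cshtrd - lag(cshtrd),
    change_close_open = prccd - prcod,
    # the whole numerator is divided by the range, then scaled by volume
    MFV = ((prccd - prcld) - (prchd - prccd)) / (prchd - prcld) * cshtrd
  )

head(toy, 3)
```

Lagged indicators are `NA` for the first rows, where no earlier observation exists (the first 10 rows for the momentum indicator). On a day where the high equals the low, the MFV denominator is zero and R returns `NaN`, so such rows may need separate handling.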


## Company Analysis

First, for each company, a subset of four of the above measures is calculated, as specified in the assignment brief.

Next, additional features are derived from time-based information. After extracting the month and year from the trading date, indicator performance over specific periods is calculated and key dates associated with extreme values are identified.
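
The time-based step can be sketched as follows, assuming `datadate` is stored as day/month/year text, as the first rows of the dataset suggest. The data frame `toy` and its values are hypothetical; only the column names come from the dataset.

```r
library(dplyr)
library(lubridate)

# Hypothetical toy data with dates written as dd/mm/yyyy text
toy <- data.frame(
  datadate = c("30/05/2023", "01/06/2023", "02/06/2023", "03/07/2023"),
  prchd    = c(11, 12, 15, 16),
  prccd    = c(10, 11, 14, 12),
  cshtrd   = c(100, 200, 300, 400)
)

toy <- toy %>%
  mutate(
    datadate     = dmy(datadate),   # parse text into Date objects first
    month        = month(datadate),
    year         = year(datadate),
    return_daily = (prccd - lag(prccd)) / lag(prccd)
  )

# total trading volume in June 2023
volume_june_2023 <- toy %>%
  filter(month == 6, year == 2023) %>%
  summarise(total = sum(cshtrd)) %>%
  pull(total)

# mean daily return, skipping the NA in the first row
mean_return <- mean(toy$return_daily, na.rm = TRUE)

# dates of the highest high price and the largest daily return
date_max_high   <- toy$datadate[which.max(toy$prchd)]
date_max_return <- toy$datadate[which.max(toy$return_daily)]
```

Parsing with `dmy()` before calling `month()` and `year()` matters here: applied directly to text such as "01/09/2020", those functions do not read the string as day/month/year, so the extracted values would be wrong.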

### SBUX (Starbucks)

In [None]:
# subset SBUX
data_SBUX <- filter(data, tic == "SBUX")

# calculate indicators
data_SBUX$return_daily <- (data_SBUX$prccd - lag(data_SBUX$prccd, 1)) / lag(data_SBUX$prccd, 1)
data_SBUX$momentum_10_day <- data_SBUX$prccd - lag(data_SBUX$prccd, 10)
data_SBUX$range_daily <- data_SBUX$prchd - data_SBUX$prcld
# MFV: divide the whole numerator by the daily range, then scale by volume
data_SBUX$MFV <- ((data_SBUX$prccd - data_SBUX$prcld) - (data_SBUX$prchd - data_SBUX$prccd)) / (data_SBUX$prchd - data_SBUX$prcld) * data_SBUX$cshtrd

# display results



In [None]:
# indicate month and year
# datadate is stored as day/month/year text, so parse it with dmy() first
data_SBUX$datadate <- dmy(data_SBUX$datadate)
data_SBUX$month <- month(data_SBUX$datadate)
data_SBUX$year <- year(data_SBUX$datadate)

In [None]:
# total trading volume in June 2023
sum(as.numeric(filter(data_SBUX, month == 6 & year == 2023)$cshtrd))

# mean daily return over the entire period

# date with the largest positive high price

# date with the largest positive daily return



### WEN (Wendy's)

In [None]:
# subset WEN
data_WEN <- filter(data, tic == "WEN")

# calculate indicators
data_WEN$return_daily <- (data_WEN$prccd - lag(data_WEN$prccd, 1)) / lag(data_WEN$prccd, 1)
data_WEN$return_overnight <- (data_WEN$prcod - lag(data_WEN$prccd, 1)) / lag(data_WEN$prccd, 1)
data_WEN$volume_change <- data_WEN$cshtrd - lag(data_WEN$cshtrd, 1)
# MFV: divide the whole numerator by the daily range, then scale by volume
data_WEN$MFV <- ((data_WEN$prccd - data_WEN$prcld) - (data_WEN$prchd - data_WEN$prccd)) / (data_WEN$prchd - data_WEN$prcld) * data_WEN$cshtrd

# display results

In [None]:
# indicate month and year
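# datadate is stored as day/month/year text, so parse it with dmy() first
data_WEN$datadate <- dmy(data_WEN$datadate)
data_WEN$month <- month(data_WEN$datadate)
data_WEN$year <- year(data_WEN$datadate)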

In [None]:
# total trading volume in June 2023

# mean daily return over the entire period

# date with the largest positive high price

# date with the largest positive daily return

### PBPB (Potbelly)

In [None]:
# subset PBPB
data_PBPB <- filter(data, tic == "PBPB")

# calculate indicators
data_PBPB$return_daily <- (data_PBPB$prccd - lag(data_PBPB$prccd, 1)) / lag(data_PBPB$prccd, 1)
data_PBPB$return_overnight <- (data_PBPB$prcod - lag(data_PBPB$prccd, 1)) / lag(data_PBPB$prccd, 1)
data_PBPB$change_close_open <- data_PBPB$prccd - data_PBPB$prcod
# MFV: divide the whole numerator by the daily range, then scale by volume
data_PBPB$MFV <- ((data_PBPB$prccd - data_PBPB$prcld) - (data_PBPB$prchd - data_PBPB$prccd)) / (data_PBPB$prchd - data_PBPB$prcld) * data_PBPB$cshtrd

# display results

In [None]:
# indicate month and year

In [None]:
# total trading volume in June 2023

# mean daily return over the entire period

# date with the largest positive high price

# date with the largest positive daily return

### Another

In [None]:
# subset Another

# calculate indicators

# display results


In [None]:
# indicate month and year

In [None]:
# total trading volume in June 2023

# mean daily return over the entire period

# date with the largest positive high price

# date with the largest positive daily return

## Summary