# Indian IT Companies: Revenue vs. Stock Price Analysis

The Indian IT sector is a major contributor to the country’s economy, with companies like TCS, Infosys, and Wipro playing a significant role in global technology services. 

## GOAL
This project investigates the relationship between revenue growth and stock price movements of major Indian IT firms over the past decade. By analyzing historical financial data and stock market trends, this study aims to determine how strongly revenue growth influences stock performance. Additionally, the project examines the impact of key events—such as earnings reports, global recessions, and AI adoption—on IT stock prices.

## Research Questions:
1. How strongly does revenue growth impact stock prices of major Indian IT firms?
2. How do major events (earnings reports, global recessions, AI adoption) influence IT stock prices?
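
One common way to put a number on "how strongly" in question 1 is the Pearson correlation coefficient between yearly revenue growth and yearly stock return. The project text does not name a specific measure, so this is an assumed choice. The symbols are notation introduced here: $g_t$ is the revenue growth rate in fiscal year $t$, $r_t$ is the stock return in the same year, and $T$ is the number of years.

```latex
\rho_{g,r} = \frac{\sum_{t=1}^{T} (g_t - \bar{g})(r_t - \bar{r})}
                  {\sqrt{\sum_{t=1}^{T} (g_t - \bar{g})^2}\,\sqrt{\sum_{t=1}^{T} (r_t - \bar{r})^2}}
```

Values near $+1$ or $-1$ indicate a strong linear relationship and values near $0$ a weak one. With roughly ten yearly observations per company, any such estimate is noisy and should be read with caution.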

***
This project will analyze **three large-cap and two mid-cap Indian IT companies**, selected for their readily available, regular, and detailed financial reports. This ensures reliable data for assessing the correlation between revenue growth and stock price movements.

#### LARGE-CAP IT FIRMS (Market Cap > \\$10 Billion)
- Tata Consultancy Services (TCS)
- Infosys
- Wipro

#### MID-CAP IT FIRMS (\\$2 Billion < Market Cap < \\$10 Billion)
- Persistent Systems
- Mphasis
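
The size buckets can be written down as a small rule. The sketch below assumes large-cap means a market cap of at least \\$10 billion and mid-cap means between \\$2 billion and \\$10 billion. The function name, the handling of the exact boundaries, and the sample figures are all hypothetical and only illustrate the thresholds.

```python
def classify_by_market_cap(market_cap_usd_bn: float) -> str:
    """Return a size bucket for a market cap given in billions of US dollars."""
    if market_cap_usd_bn >= 10:   # assumption: exactly $10B counts as large-cap
        return "large-cap"
    if market_cap_usd_bn > 2:     # mid-cap: above $2B and below $10B
        return "mid-cap"
    return "below mid-cap"


# Hypothetical market caps in USD billions, for illustration only (not real figures)
examples = {"FirmA": 150.0, "FirmB": 6.5, "FirmC": 1.2}
buckets = {name: classify_by_market_cap(cap) for name, cap in examples.items()}
print(buckets)
```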
***

In [1]:
import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [2]:
import time

***

### DATA COLLECTION
The revenues and revenue growth rates of the companies were collected from their respective annual reports.

#### LARGE CAPS

In [2]:
tickers = ["TCS.NS", "INFY.NS", "WIPRO.NS"]
start_date = "2013-04-01"
end_date = "2025-03-31"

In [3]:
data = yf.download(tickers, start=start_date, end=end_date)

YF.download() has changed argument auto_adjust default to True


[*********************100%***********************]  3 of 3 completed

3 Failed downloads:
['INFY.NS', 'WIPRO.NS', 'TCS.NS']: YFRateLimitError('Too Many Requests. Rate limited. Try after a while.')
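
A rate-limit failure like the one above is often transient, so one option is to retry with an increasing pause between attempts. The sketch below is a generic helper, not part of yfinance: `retry_with_backoff`, `fetch`, `is_ok`, and the delay values are hypothetical names and settings. It assumes a failed download comes back as an empty result rather than raising an exception, which should be checked against the installed yfinance version.

```python
import time


def retry_with_backoff(fetch, is_ok, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call fetch() until is_ok(result) is true, doubling the wait after each failure."""
    result = None
    for attempt in range(max_attempts):
        result = fetch()
        if is_ok(result):
            return result
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # waits 2s, 4s, 8s, ...
    return result  # last (unsuccessful) result; the caller decides what to do


# Self-contained demo: a fake fetch that fails twice, then succeeds.
attempts = []
waits = []


def flaky_fetch():
    attempts.append(1)
    return "data" if len(attempts) >= 3 else None


result = retry_with_backoff(flaky_fetch, is_ok=lambda r: r is not None, sleep=waits.append)
print(result, waits)

# Possible use with the download above (not run here):
# data = retry_with_backoff(
#     lambda: yf.download(["TCS.NS", "INFY.NS", "WIPRO.NS"], start="2013-04-01", end="2025-03-31"),
#     is_ok=lambda df: not df.empty,
# )
```

Passing `sleep` as a parameter keeps the demo instant; in real use the default `time.sleep` applies.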


In [5]:
closing_prices = data['Close']
# Last close of each fiscal year (April to March), labelled by the fiscal year's start date
yearly_prices = closing_prices.resample('YS-APR').last()
returns = yearly_prices.pct_change() * 100
# The price columns come back in alphabetical order (INFY, TCS, WIPRO),
# so derive the return labels from the columns instead of hard-coding them
returns.columns = [f"{ticker.split('.')[0]}_Return" for ticker in returns.columns]
result = pd.concat([yearly_prices, returns], axis=1)
result = result.dropna()
result

| Date | INFY.NS | TCS.NS | WIPRO.NS | INFY_Return | TCS_Return | WIPRO_Return |
|---|---|---|---|---|---|---|
| 2014-04-01 | 426.817047 | 1054.946411 | 105.917068 | 38.063467 | 21.59567 | 18.854461 |
| 2015-04-01 | 480.29425 | 1057.828857 | 98.330391 | 12.529304 | 0.273232 | -7.162847 |
| 2016-04-01 | 412.069733 | 1039.67627 | 90.426613 | -14.204733 | -1.716023 | -8.037981 |
| 2017-04-01 | 469.690002 | 1241.496216 | 99.243706 | 13.983136 | 19.411807 | 9.750551 |
| 2018-04-01 | 632.609375 | 1768.772461 | 120.578621 | 34.686574 | 42.471031 | 21.4975 |
| 2019-04-01 | 570.777283 | 1680.097778 | 93.478172 | -9.774135 | -5.013346 | -22.475335 |
| 2020-04-01 | 1247.051636 | 2957.091064 | 197.704391 | 118.483053 | 76.007081 | 111.497922 |
| 2021-04-01 | 1772.524658 | 3517.693115 | 283.479187 | 42.137231 | 18.957889 | 43.385377 |
| 2022-04-01 | 1356.077026 | 3113.397705 | 178.304642 | -23.494603 | -11.493197 | -37.101329 |
| 2023-04-01 | 1460.432129 | 3820.962891 | 235.37294 | 7.695367 | 22.726463 | 32.006064 |


In [6]:
# result.to_csv("large_cap_stock.csv")

In [36]:
df = pd.read_excel("Largecap_rev.xlsx")
df

| | COMPANY | TIME PERIOD | REVENUE (in Cr) | REVENUE GROWTH RATE |
|---|---|---|---|---|
| 0 | INFOSYS | 2024-04-01 | 153670 | 4.7 |
| 1 | INFOSYS | 2023-04-01 | 146767 | 20.6 |
| 2 | INFOSYS | 2022-04-01 | 121641 | 21.0 |
| 3 | INFOSYS | 2021-04-01 | 100472 | 10.6 |
| 4 | INFOSYS | 2020-04-01 | 90791 | 9.816752 |
| 5 | INFOSYS | 2019-04-01 | 82675 | 17.23292 |
| 6 | INFOSYS | 2018-04-01 | 70522 | 2.975878 |
| 7 | INFOSYS | 2017-04-01 | 68484 | 9.677936 |
| 8 | INFOSYS | 2016-04-01 | 62441 | 17.108348 |
| 9 | INFOSYS | 2015-04-01 | 53319 | 6.355095 |
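
To relate revenue growth to stock returns, the two tables have to be lined up on the same fiscal-year label before any correlation is computed. The sketch below shows one possible way to do that with a merge. The numbers are hypothetical placeholders, the `stock_return` column name is invented for the example, and it assumes that a `TIME PERIOD` date and a return-table date refer to the same fiscal year, which should be verified against the annual reports.

```python
import pandas as pd

# Hypothetical revenue growth rates (percent), keyed by fiscal-year date
revenue = pd.DataFrame({
    "TIME PERIOD": pd.to_datetime(["2020-04-01", "2021-04-01", "2022-04-01", "2023-04-01"]),
    "REVENUE GROWTH RATE": [5.0, 12.0, 20.0, 8.0],
})

# Hypothetical yearly stock returns (percent), indexed by fiscal-year date
returns = pd.DataFrame(
    {"stock_return": [10.0, 30.0, 45.0, -5.0, 12.0]},
    index=pd.to_datetime(["2019-04-01", "2020-04-01", "2021-04-01", "2022-04-01", "2023-04-01"]),
)
returns.index.name = "Date"

# Keep only fiscal years present in both tables
merged = revenue.merge(returns, left_on="TIME PERIOD", right_index=True, how="inner")

# Pearson correlation between growth and return over the overlapping years
pearson = merged["REVENUE GROWTH RATE"].corr(merged["stock_return"])
print(merged)
print(f"Pearson correlation: {pearson:.3f}")
```

An inner merge silently drops years that appear in only one table, so it is worth checking `len(merged)` against the expected number of fiscal years.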


***

#### MID CAPS

In [3]:
tickers = ["PERSISTENT.NS", "MPHASIS.NS"]
start_date = "2013-04-01"
end_date = "2025-03-31"

In [4]:
data = yf.download(tickers, start=start_date, end=end_date)

YF.download() has changed argument auto_adjust default to True


[*********************100%***********************]  1 of 2 completed

2 Failed downloads:
['PERSISTENT.NS', 'MPHASIS.NS']: YFRateLimitError('Too Many Requests. Rate limited. Try after a while.')


#### The yfinance download failed (rate limited), so the data is scraped from Yahoo Finance instead

In [6]:
import bs4
import requests

In [28]:
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36",
    "Referer": "https://www.google.com/"
}
website = requests.get('https://finance.yahoo.com/quote/PERSISTENT.NS/history/?period1=1364688000&period2=1740155246&frequency=1mo', headers=headers)
soup = bs4.BeautifulSoup(website.text, "lxml")
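
The `period1` and `period2` query parameters in the URL are Unix timestamps (seconds since 1970-01-01 UTC) that bound the requested history. The sketch below shows how such values can be produced and decoded with the standard library; the helper names are hypothetical.

```python
from datetime import datetime, timezone


def to_epoch_seconds(year, month, day):
    """Midnight UTC of the given date as a Unix timestamp."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


def from_epoch_seconds(seconds):
    """Unix timestamp back to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


period1 = to_epoch_seconds(2013, 3, 31)
print(period1)                                 # 1364688000, the period1 used above
print(from_epoch_seconds(1740155246).date())   # 2025-02-21, the period2 used above
```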

In [29]:
tables = soup.select('table.yf-1jecxey.noDl')

In [34]:
print(f"Total tables found: {len(tables)}")

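rows = []  # stays empty if no table is found, so the final expression never fails
if not tables:
    print("No tables found on the webpage. Check the HTML structure or the URL.")
else:
    table = tables[0]
    # Note: this reuses the name `headers`, replacing the HTTP request headers defined earlier
    headers = [header.text.strip() for header in table.find_all("th")]
    # Skip the header row, then collect the cell text of every data row
    for row in table.find_all("tr")[1:]:
        cols = [col.text.strip() for col in row.find_all("td")]
        if cols:
            rows.append(cols)
rows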

Total tables found: 1


[['Feb 21, 2025',
  '2,640.00',
  '2,650.95',
  '2,555.50',
  '2,567.55',
  '2,567.55',
  '248,632'],
 ['Feb 1, 2025',
  '2,790.00',
  '2,896.70',
  '2,507.00',
  '2,637.25',
  '2,637.25',
  '5,490,993'],
 ['Jan 1, 2025',
  '2,859.45',
  '3,078.40',
  '2,736.50',
  '2,867.95',
  '2,867.95',
  '16,796,283'],
 ['Dec 1, 2024',
  '2,974.95',
  '3,237.95',
  '2,812.50',
  '2,847.20',
  '2,847.20',
  '9,756,722'],
 ['Nov 1, 2024',
  '2,885.00',
  '3,047.85',
  '2,751.05',
  '2,974.55',
  '2,974.55',
  '12,023,124'],
 ['Oct 1, 2024',
  '3,034.80',
  '3,144.75',
  '2,836.60',
  '2,879.55',
  '2,879.55',
  '21,246,148'],
 ['Sep 1, 2024',
  '3,104.95',
  '3,187.80',
  '2,918.55',
  '3,010.40',
  '3,010.40',
  '13,055,323'],
 ['Aug 1, 2024',
  '2,904.60',
  '3,153.00',
  '2,589.35',
  '3,104.95',
  '3,104.95',
  '14,946,217'],
 ['Jul 10, 2024', '55.00 Dividend'],
 ['Jul 1, 2024',
  '2,481.00',
  '3,080.95',
  '2,457.80',
  '2,892.50',
  '2,831.32',
  '28,083,038'],
 ['Jun 1, 2024',
  '2,368.00',


In [31]:
import csv
filename = "persistent_stk.csv"
with open(filename, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(headers)  # Write headers
    writer.writerows(rows)  # Write rows

print(f"Data saved to {filename}")

Data saved to persistent_stk.csv


In [32]:
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36",
    "Referer": "https://www.google.com/"
}
website = requests.get('https://finance.yahoo.com/quote/MPHASIS.NS/history/?frequency=1mo&period1=1025495100&period2=1740157674', headers=headers)
soup = bs4.BeautifulSoup(website.text, "lxml")

In [33]:
tables = soup.select('table.yf-1jecxey.noDl')
print(f"Total tables found: {len(tables)}")

Total tables found: 1


In [35]:
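rows = []  # stays empty if no table is found, so the final expression never fails
if not tables:
    print("No tables found on the webpage. Check the HTML structure or the URL.")
else:
    table = tables[0]
    # Note: this reuses the name `headers`, replacing the HTTP request headers defined earlier
    headers = [header.text.strip() for header in table.find_all("th")]
    # Skip the header row, then collect the cell text of every data row
    for row in table.find_all("tr")[1:]:
        cols = [col.text.strip() for col in row.find_all("td")]
        if cols:
            rows.append(cols)
rows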



In [36]:
filename = "mphasis_stk.csv"
with open(filename, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(headers)
    writer.writerows(rows)

print(f"Data saved to {filename}")

Data saved to mphasis_stk.csv


In [47]:
data = pd.read_csv("persistent_stk.csv", parse_dates=["Date"])  # dates look like "Feb 21, 2025"
data.set_index("Date", inplace=True)
data.head()

| Date | Open | High | Low | Close Close price adjusted for splits. | Adj Close Adjusted close price adjusted for splits and dividend and/or capital gain distributions. | Volume |
|---|---|---|---|---|---|---|
| 2025-02-21 | 5925.00 | 5937.4 | 5680.0 | 5710.3 | 5710.3 | 335438.0 |
| 2025-02-01 | 6047.75 | 6303.95 | 5388.5 | 5917.9 | 5917.9 | 5476500.0 |
| 2025-01-31 | 20.00 Dividend | | | | | |
| 2025-01-01 | 6457.65 | 6527.85 | 5445.0 | 6032.6 | 6012.68 | 15093715.0 |
| 2024-12-01 | 5905.65 | 6788.9 | 5853.65 | 6457.7 | 6436.38 | 9893453.0 |


In [53]:
closing_prices = data['Close      Close price adjusted for splits.'].str.replace(",", "").astype(float)
yearly_prices = closing_prices.resample('YS-APR').last()
returns = yearly_prices.pct_change() * 100
returns.name = "stk_Return"
result = pd.concat([yearly_prices, returns], axis=1).dropna()
result

| Date | Close Close price adjusted for splits. | stk_Return |
|---|---|---|
| 2014-04-01 | 357.15 | 36.129745 |
| 2015-04-01 | 381.08 | 6.700266 |
| 2016-04-01 | 297.85 | -21.840558 |
| 2017-04-01 | 347.02 | 16.50831 |
| 2018-04-01 | 314.77 | -9.293412 |
| 2019-04-01 | 275.48 | -12.48213 |
| 2020-04-01 | 961.03 | 248.856541 |
| 2021-04-01 | 2382.65 | 147.926704 |
| 2022-04-01 | 2304.75 | -3.269469 |
| 2023-04-01 | 3984.55 | 72.884261 |


In [56]:
data = pd.read_csv("mphasis_stk.csv", parse_dates=["Date"])
data.set_index("Date", inplace=True)
# Scraped prices are strings with thousands separators, e.g. "2,567.55"
closing_prices = data['Close      Close price adjusted for splits.'].str.replace(",", "").astype(float)
# Last close of each April-to-March fiscal year, labelled by the fiscal year's start date
yearly_prices = closing_prices.resample('YS-APR').last()
returns = yearly_prices.pct_change() * 100
returns.name = "stk_Return"
result2 = pd.concat([yearly_prices, returns], axis=1).dropna()
# The scraped Mphasis history starts before 2013, so keep only the study window
result2 = result2[result2.index > "2013-04-01"]
result2

| Date | Close Close price adjusted for splits. | stk_Return |
|---|---|---|
| 2014-04-01 | 385.4 | -4.60396 |
| 2015-04-01 | 491.8 | 27.60768 |
| 2016-04-01 | 579.9 | 17.913786 |
| 2017-04-01 | 837.75 | 44.464563 |
| 2018-04-01 | 991.1 | 18.304984 |
| 2019-04-01 | 664.45 | -32.958329 |
| 2020-04-01 | 1776.5 | 167.363985 |
| 2021-04-01 | 3376.85 | 90.084436 |
| 2022-04-01 | 1795.75 | -46.821742 |
| 2023-04-01 | 2388.05 | 32.983433 |


In [62]:
result.columns

Index(['Close      Close price adjusted for splits.', 'PERSISTENT_Return'], dtype='object')

In [64]:
result2 = result2.rename(columns={'Close      Close price adjusted for splits.': 'MPHASIS.NS', 'stk_Return': 'MPHASIS_Return'})
result = result.rename(columns={'Close      Close price adjusted for splits.': 'PERSISTENT.NS', 'stk_Return': 'PERSISTENT_Return'})
final = pd.concat([result,result2],axis=1)
final

| Date | PERSISTENT.NS | PERSISTENT_Return | MPHASIS.NS | MPHASIS_Return |
|---|---|---|---|---|
| 2014-04-01 | 357.15 | 36.129745 | 385.4 | -4.60396 |
| 2015-04-01 | 381.08 | 6.700266 | 491.8 | 27.60768 |
| 2016-04-01 | 297.85 | -21.840558 | 579.9 | 17.913786 |
| 2017-04-01 | 347.02 | 16.50831 | 837.75 | 44.464563 |
| 2018-04-01 | 314.77 | -9.293412 | 991.1 | 18.304984 |
| 2019-04-01 | 275.48 | -12.48213 | 664.45 | -32.958329 |
| 2020-04-01 | 961.03 | 248.856541 | 1776.5 | 167.363985 |
| 2021-04-01 | 2382.65 | 147.926704 | 3376.85 | 90.084436 |
| 2022-04-01 | 2304.75 | -3.269469 | 1795.75 | -46.821742 |
| 2023-04-01 | 3984.55 | 72.884261 | 2388.05 | 32.983433 |


In [65]:
final.to_csv('mid_cap_stock.csv')

***
***