## Importing stock data

In [17]:
import pandas as pd
df_st = pd.read_csv("stock-data.csv")
df_st

| Unnamed: 0 | Stocks | Sub-Sector | Market Cap | Close Price | PE Ratio | 5Y CAGR | Beta | Alpha | 5Y HISREVGTH |
|---|---|---|---|---|---|---|---|---|---|
| 0 | RELIANCE | Oil & Gas - Refining & Marketing | 1647964.09 | 2473.30 | 33.54 | 38.02 | 1.14 | 14.90 | 12.03 |
| 1 | TCS | IT Services & Consulting | 1285679.29 | 3475.70 | 39.64 | 26.75 | 0.63 | 13.71 | 8.41 |
| 2 | HDFCBANK | Private Banks | 852892.49 | 1539.40 | 26.79 | 20.51 | 1.10 | -4.95 | 15.95 |
| 3 | INFY | IT Services & Consulting | 745692.88 | 1779.40 | 38.54 | 31.07 | 0.61 | 25.13 | 9.39 |
| 4 | HINDUNILVR | FMCG - Household Products | 563760.93 | 2399.40 | 70.51 | 24.48 | 0.24 | 6.54 | 7.77 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 4551 | RPPINFRPP | - | - | 32.70 | - | - | - | - | - |
| 4552 | AIRTELPP | - | - | 364.90 | - | - | - | - | - |
| 4553 | SAPPHIRE | - | - | 1211.55 | 0.00 | - | - | - | 45.72 |
| 4554 | PAYTM | - | - | 1560.80 | 0.00 | - | - | - | 27.54 |


### Removing rows that do not have a Market Cap value

In [18]:
df_st1 = df_st[df_st['Market Cap'] != '-']
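
An alternative to comparing against `'-'` is to coerce the column to numbers first, which catches any non-numeric placeholder and does the filtering and conversion in one pass. The following is a minimal sketch on made-up rows; the `toy` frame and its values are illustrative assumptions, not data from `stock-data.csv`.

```python
import pandas as pd

# Hypothetical toy frame standing in for df_st; the values are made up.
toy = pd.DataFrame({
    "Stocks": ["AAA", "BBB", "CCC"],
    "Market Cap": ["1500.5", "-", "320.0"],
})

# errors="coerce" turns anything that is not a number (here "-") into NaN.
toy["Market Cap"] = pd.to_numeric(toy["Market Cap"], errors="coerce")

# Drop the rows whose market cap could not be parsed.
toy_clean = toy.dropna(subset=["Market Cap"])
print(toy_clean)
```

This variant would also drop rows with other unparseable strings, which may or may not be desirable depending on how the source file marks missing values.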

### Converting string values to numeric in Market Cap column

In [19]:
df_st1 = df_st1.copy()  # work on an explicit copy so the assignment below does not raise SettingWithCopyWarning
df_st1['Market Cap'] = pd.to_numeric(df_st1['Market Cap'])


### Sorting by Market Cap in descending order to split into large-cap, medium-cap and small-cap groups

In [20]:
df_st1 = df_st1.sort_values('Market Cap', ascending = False)

### Taking the top 200 companies as large-cap companies

In [21]:
df_lc = df_st1[:200]  # [:200] selects 200 rows; [:199] would select only 199

### Taking 200 companies, starting from the market cap closest to 19000, as medium-cap companies

In [22]:
val_mc = 19000
close_val_fn_mc = lambda list_val : abs(list_val - val_mc)
close_val_mc = min(df_st1['Market Cap'], key=close_val_fn_mc)
start_mc = list(df_st1['Market Cap']).index(close_val_mc)  # position of the closest value in the sorted frame
df_mc = df_st1.iloc[start_mc:start_mc + 200]  # 200 rows; +201 would select 201

### Taking 200 companies, starting from the market cap closest to 4000, as small-cap companies

In [23]:
val_sc = 4000
close_val_fn_sc = lambda list_val : abs(list_val - val_sc)
close_val_sc = min(df_st1['Market Cap'], key=close_val_fn_sc)
start_sc = list(df_st1['Market Cap']).index(close_val_sc)  # position of the closest value in the sorted frame
df_sc = df_st1.iloc[start_sc:start_sc + 200]  # 200 rows; +201 would select 201
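
The medium-cap and small-cap selections follow the same pattern: find the row whose market cap is closest to a threshold, then take a fixed number of rows from there. One way to express that once is a small helper. This is a sketch only; `slice_from_closest` is a hypothetical name, the toy values are made up, and it assumes the column is numeric and the frame is already sorted in descending order.

```python
import pandas as pd

def slice_from_closest(df, column, target, n):
    """Return n rows starting at the row whose `column` value is closest to `target`.

    Assumes `column` is numeric and `df` is already sorted in descending order.
    """
    # Positional index of the first value with the smallest absolute distance.
    start = (df[column] - target).abs().to_numpy().argmin()
    return df.iloc[start:start + n]

# Made-up market caps, sorted in descending order.
toy = pd.DataFrame({
    "Stocks": ["A", "B", "C", "D", "E"],
    "Market Cap": [30000.0, 19500.0, 18900.0, 5000.0, 3900.0],
})

mid = slice_from_closest(toy, "Market Cap", 19000, 2)
print(list(mid["Stocks"]))  # ['C', 'D']
```

The closest value can sit slightly above the threshold, so if a strict "less than" cut is intended, an extra filter such as `df[df[column] < target]` before slicing would be needed.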

### Exporting small-cap, medium-cap and large-cap company data to CSV files

In [24]:
df_sc.to_csv('small-cap.csv', index=False)
df_mc.to_csv('medium-cap.csv', index=False)
df_lc.to_csv('large-cap.csv', index=False)

### Removing invalid entries from '5Y HISREVGTH', converting it to numeric, and keeping rows where it is greater than 0

In [25]:
df_lc = df_lc[df_lc['5Y HISREVGTH'] != '-'].copy()  # explicit copy avoids SettingWithCopyWarning
df_lc['5Y HISREVGTH'] = pd.to_numeric(df_lc['5Y HISREVGTH'])
df_lc = df_lc[df_lc['5Y HISREVGTH'] > 0]

### Removing invalid entries from 'Beta', converting it to numeric, and keeping rows where it is between 0 and 1

In [26]:
df_lc1 = df_lc[df_lc['Beta'] != '-'].copy()  # explicit copy avoids SettingWithCopyWarning
df_lc1['Beta'] = pd.to_numeric(df_lc1['Beta'])
df_lc1 = df_lc1[df_lc1['Beta'].between(0, 1)]  # keeps 0 <= Beta <= 1 (inclusive)


### Removing invalid entries from 'Alpha', converting it to numeric, sorting by 'Alpha' in descending order, and taking the top 50 rows

In [27]:
df_lc2 = df_lc1[df_lc1['Alpha'] != '-'].copy()  # explicit copy avoids SettingWithCopyWarning
df_lc2['Alpha'] = pd.to_numeric(df_lc2['Alpha'])
df_lc2 = df_lc2.sort_values('Alpha', ascending = False)
df_lc2 = df_lc2[:50]
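
The filter, convert, sort and slice sequence used in this step recurs for other columns, so it could be wrapped in a small function. The sketch below is one possible shape, not the notebook's own code: `top_n_by` and its `placeholder` parameter are hypothetical names, and the toy values are made up.

```python
import pandas as pd

def top_n_by(df, column, n, placeholder="-"):
    """Return the n rows with the largest numeric value in `column`."""
    # .copy() gives an independent frame, so the assignment below
    # does not trigger SettingWithCopyWarning.
    out = df[df[column] != placeholder].copy()
    out[column] = pd.to_numeric(out[column])
    return out.sort_values(column, ascending=False).head(n)

# Made-up values; "-" marks a missing entry as in the source data.
toy = pd.DataFrame({
    "Stocks": ["A", "B", "C", "D"],
    "Alpha": ["2.5", "-", "10.0", "-1.0"],
})

top2 = top_n_by(toy, "Alpha", 2)
print(list(top2["Stocks"]))  # ['C', 'A']
```

With such a helper, the 'Alpha' and '5Y CAGR' rankings would each reduce to a single call on `df_lc1`.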

### Removing invalid entries from '5Y CAGR', converting it to numeric, sorting by '5Y CAGR' in descending order, and taking the top 50 rows

In [28]:
df_lc3 = df_lc1[df_lc1['5Y CAGR'] != '-'].copy()  # explicit copy avoids SettingWithCopyWarning
df_lc3['5Y CAGR'] = pd.to_numeric(df_lc3['5Y CAGR'])
df_lc3 = df_lc3.sort_values('5Y CAGR', ascending = False)
df_lc3 = df_lc3[:50]


### Finding the companies common to both dataframes formed above

In [29]:
common = list(set(df_lc2['Stocks']) & set(df_lc3['Stocks']))

### Sorting by the sum of ranks (positions) in both dataframes and taking the 6 companies with the lowest sum

In [30]:
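# Rank sum = position in the '5Y CAGR' ranking + position in the 'Alpha' ranking
cagr_order = list(df_lc3['Stocks'])
alpha_order = list(df_lc2['Stocks'])
sum_in = [cagr_order.index(name) + alpha_order.index(name) for name in common]

rank_data = {'index' : sum_in, 'com_name': common}  # renamed from dict to avoid shadowing the built-in
df_lc4 = pd.DataFrame(rank_data)
df_lc4 = df_lc4.sort_values('index', ascending = True)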

ls_cmn = list(df_lc4['com_name'][:6])
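
The intersection and rank-sum steps can also be expressed with a single merge, which keeps the two ranks visible as columns. This is a sketch on made-up rankings, not the notebook's method; the frame names and the `alpha_rank`, `cagr_rank` and `rank_sum` columns are illustrative assumptions.

```python
import pandas as pd

# Made-up rankings: position 0 is the best rank in each list.
by_alpha = pd.DataFrame({"Stocks": ["A", "B", "C", "D"]})
by_cagr = pd.DataFrame({"Stocks": ["C", "A", "E", "B"]})

# Attach each stock's position as an explicit rank column.
alpha_rank = by_alpha.assign(alpha_rank=range(len(by_alpha)))
cagr_rank = by_cagr.assign(cagr_rank=range(len(by_cagr)))

# An inner merge keeps only the stocks present in both rankings.
merged = alpha_rank.merge(cagr_rank, on="Stocks", how="inner")
merged["rank_sum"] = merged["alpha_rank"] + merged["cagr_rank"]

# A stable sort keeps tied stocks in the order of the first ranking.
best = merged.sort_values("rank_sum", kind="stable").head(2)
print(list(best["Stocks"]))  # ['A', 'C']
```

Unlike a `set` intersection, whose iteration order is not guaranteed, the merge preserves the order of the left frame, so ties are broken the same way on every run.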

### Getting data for the 6 selected companies from the large-cap data

In [31]:
df_lc5 = df_lc[df_lc['Stocks'].isin(ls_cmn)]

### Exporting these 6 companies' data to a CSV file

In [32]:
df_lc5.to_csv('top-6-large-cap.csv',index=False)