# Working with a CSV file from the cloud

In this notebook, we use the pandas module to read a CSV file from a URL into a DataFrame and then convert it to a NumPy ndarray.

In [1]:
import pandas as pd
import numpy as np

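First confirm that the URL points to a file you can open in a web browser. Then set the "header" parameter to match the first line of the CSV file: use header=None when the first line is already data, as it is here, or header=0 when the first line holds the column names.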

In [2]:
url = 'https://raw.githubusercontent.com/gphanikumar/MM2090/master/scripts/RollList.csv'
df = pd.read_csv(url, header=None)
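
To see what the "header" parameter changes without touching the network, here is a minimal sketch that reads a two-line, in-memory sample. The sample rows are copied from the roll list; the column names "roll_no" and "name" are assumptions made for this illustration and do not appear in the original file.

```python
import io

import pandas as pd

# Two rows in the same layout as the roll list (no header line).
sample = "CE19B089,Sruthi Sreeram\nME18B009,Bharath Chandar\n"

# header=None: every line is treated as data; columns are labelled 0, 1, ...
no_header = pd.read_csv(io.StringIO(sample), header=None)

# header=0: the first line is consumed as column names, so one data row is lost.
with_header = pd.read_csv(io.StringIO(sample), header=0)

# header=None combined with names=...: keep every row and label the columns.
# The names used here are hypothetical.
named = pd.read_csv(io.StringIO(sample), header=None, names=["roll_no", "name"])

print(no_header.shape)
print(with_header.shape)
print(list(named.columns))
```

If a file with no header line is read with header=0, the first student silently becomes the column labels, which is why header=None is used for this file.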

In [3]:
type(df)

pandas.core.frame.DataFrame

Check the first few rows to see what the data looks like.

In [4]:
print(df.head(10))

          0                 1
0  CE19B089    Sruthi Sreeram
1  ME18B009   Bharath Chandar
2  ME18B020        Aravindh P
3  ME18B027    Rajasundaram M
4  ME18B033        Suganth NN
5  ME18B046          Deepak G
6  ME18B086   Arvind Raghav V
7  ME18B089       Sriharan BS
8  ME18B145   Ashwin Kumar KS
9  ME18B146  Vikas Mahendar K


You can convert the DataFrame to a NumPy ndarray.

In [5]:
names = df.to_numpy()

In [6]:
type(names)

numpy.ndarray

In [7]:
names.shape

(78, 2)

In [8]:
names[0:5,1]

array(['Sruthi Sreeram', 'Bharath Chandar', 'Aravindh P',
       'Rajasundaram M', 'Suganth NN'], dtype=object)
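
The dtype=object in the output above appears because the array holds Python strings. As a rough sketch of how the element type depends on what is converted, the example below builds a tiny DataFrame; the column names and the "score" values are made up for this illustration.

```python
import numpy as np
import pandas as pd

# Two rows copied from the roll list, plus a hypothetical numeric column.
small = pd.DataFrame({
    "roll_no": ["CE19B089", "ME18B009"],
    "name": ["Sruthi Sreeram", "Bharath Chandar"],
    "score": [12.5, 30.0],
})

# Converting the whole frame mixes strings and floats in one array,
# so NumPy generally falls back to a generic object dtype.
everything = small.to_numpy()

# Converting only the numeric column keeps a numeric dtype,
# which is what you want before doing arithmetic.
scores = small["score"].to_numpy()

print(everything.shape, everything.dtype)
print(scores.dtype, scores.mean())
```

Selecting the numeric columns before calling to_numpy() avoids the slower object arrays when the goal is computation.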

You can insert a column into the DataFrame. First make sure the new column has the same length as the DataFrame to avoid a mismatch. Here, the column named "es" holds random numbers scaled to lie between 0 and 40.

In [9]:
es = 40*np.random.rand(len(names))
es.shape

(78,)

In [10]:
df.insert(2, "es", es, allow_duplicates=True)
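
The following sketch shows what happens when the lengths do not match, using a three-row stand-in for the roll list. The column name "es_bad" and the variable names are assumptions made for this illustration.

```python
import numpy as np
import pandas as pd

# Three rows copied from the roll list, with the same 0/1 column labels.
frame = pd.DataFrame({
    0: ["CE19B089", "ME18B009", "ME18B020"],
    1: ["Sruthi Sreeram", "Bharath Chandar", "Aravindh P"],
})

good = 40 * np.random.rand(len(frame))      # one value per row, in [0, 40)
bad = 40 * np.random.rand(len(frame) + 1)   # deliberately one value too many

frame.insert(2, "es", good)                 # lengths agree, so this succeeds

try:
    frame.insert(3, "es_bad", bad)          # lengths differ
    mismatch_detected = False
except ValueError as err:
    mismatch_detected = True
    print("insert rejected:", err)

print(frame.shape)
```

Using len(df) (or len(names), as in the cell above) to size the random array keeps the two lengths in step even if the roll list changes.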

In [11]:
print(df)

           0                  1         es
0   CE19B089     Sruthi Sreeram  39.904213
1   ME18B009    Bharath Chandar   7.466924
2   ME18B020         Aravindh P  27.672600
3   ME18B027     Rajasundaram M  24.771883
4   ME18B033         Suganth NN  12.355439
..       ...                ...        ...
73  MM19B044   Pranav Choudhari  33.391037
74  MM19B045          Aswanth R  38.274226
75  MM19B046  Rishaab Karthik R  33.118704
76  MM19B049       Rohan Korale  23.184136
77  MM19B054      Shreya Smitha   3.846585

[78 rows x 3 columns]


You can save the DataFrame to a CSV file in the present working directory of the notebook. A relative path also works, provided you have write permission for that folder.

Uncomment (i.e., remove the # character at the beginning of) the following line and try it in your notebook on your computer. The cell is commented out here so that the notebook renders on GitHub.

In [12]:
#df.to_csv("marks.csv", index=False)
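
If you would rather not leave a file behind while experimenting, one option is to write into a temporary directory and read the file back to check it. This is only a sketch: the two rows are taken from the output above, and the column name "roll_no" and the variable names are assumptions.

```python
import tempfile
from pathlib import Path

import pandas as pd

# A two-row stand-in for the DataFrame built above.
frame = pd.DataFrame({
    "roll_no": ["CE19B089", "ME18B009"],
    "es": [39.904213, 7.466924],
})

with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "marks.csv"

    # index=False leaves the row labels (0, 1, ...) out of the file.
    frame.to_csv(out_path, index=False)

    # File size in bytes, read from Python instead of the shell.
    size_bytes = out_path.stat().st_size

    # Read the file back while the temporary directory still exists.
    restored = pd.read_csv(out_path)

print(size_bytes)
print(restored)
```

The temporary directory and its contents are deleted when the with block ends, so nothing is written to the working directory.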

Prefix a command with ! to pass it to the shell for execution. Uncomment the following line and try it in your notebook on your computer. The cell is commented out here so that the notebook renders on GitHub.

In [13]:
#!ls -l marks.csv

-rw-rw-r-- 1 gphani gphani 3298 May 19 18:31 marks.csv
