# Working with a CSV file from the cloud

In this notebook, we use the pandas module to read a CSV file from the cloud into a dataframe and then copy it into a NumPy ndarray.

In [1]:
import pandas as pd
import numpy as np

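First confirm that the URL points to a raw file that opens in a web browser. Then set the parameter "header" to match the first line of the CSV file: header=0 treats that line as the column names, while header=None (used here) treats it as ordinary data and labels the columns 0, 1, 2, and so on.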

In [2]:
url = 'https://raw.githubusercontent.com/r3kste/ID2090/main/RollList23.csv'
df = pd.read_csv(url, header=None)
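
The effect of "header" is easiest to see on a small in-memory file. The sketch below is illustrative only: the three-line CSV text and the names `csv_text`, `raw` and `named` are made up for this example and are not part of the roll list.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for the remote file.
csv_text = "RollNo,Name\nX001,ALPHA\nX002,BETA\n"

# header=None: the first line is kept as data; columns are labelled 0, 1, ...
raw = pd.read_csv(io.StringIO(csv_text), header=None)

# header=0: the first line becomes the column names.
named = pd.read_csv(io.StringIO(csv_text), header=0)

print(raw.shape)            # one extra row, because the titles count as data
print(named.shape)
print(list(named.columns))
```

With header=None the titles end up in row 0 of the dataframe, which is why "RollNo" and "First Name" show up as data in this notebook's output.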

In [3]:
type(df)

Check the first few rows to see what the data looks like.

In [4]:
print(df.head(10))

         0                     1          2       3   \
0    RollNo            First Name  Last Name  Gender   
1  AE23B005       BHAVESH AGARWAL        NaN       M   
2  AE23B010              GUHAAN K        NaN       M   
3  AE23B011          HEMANT MEENA        NaN       M   
4  AE23B013        KISHOREKUMAR S        NaN       M   
5  AE23B020  MANISH SARAVANAKUMAR        NaN       M   
6  AE23B029          SARAN RAAM S        NaN       M   
7  AE23B034            DEV MANDAL        NaN       M   
8  AE23B036        CHINMAY BORKER        NaN       M   
9  AE23B037    HARSH VARDHAN JENA        NaN       M   

                          4             5         6     7         8   \
0              Student Email  Program Name  Semester  Slot  CourseNo   
1  ae23b005@smail.iitm.ac.in        B.Tech        02     F    ID2090   
2  ae23b010@smail.iitm.ac.in        B.Tech        02     F    ID2090   
3  ae23b011@smail.iitm.ac.in        B.Tech        02     F    ID2090   
4  ae23b013@smail.iitm.

You can copy the data from the dataframe into a NumPy ndarray.

In [5]:
names = df.to_numpy()

In [6]:
type(names)

numpy.ndarray

In [7]:
names.shape

(125, 11)

In [8]:
names[0:5,1]

array(['First Name', 'BHAVESH AGARWAL', 'GUHAAN K', 'HEMANT MEENA',
       'KISHOREKUMAR S'], dtype=object)
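
When a dataframe mixes text and numbers, the resulting array has dtype object, so a numeric column usually needs an explicit cast before doing arithmetic on it. A minimal sketch, using a made-up dataframe `toy` (its column names and values are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical dataframe with one text column and one numeric column.
toy = pd.DataFrame({"name": ["ALPHA", "BETA", "GAMMA"],
                    "marks": [12.5, 30.0, 7.25]})

arr = toy.to_numpy()
print(arr.dtype)   # object, because the columns have different types

# Slice out the numeric column and cast it to float for arithmetic.
marks = arr[:, 1].astype(float)
print(marks.dtype, marks.sum())
```

The same slicing idea applies to the array above: names[1:, 1] skips row 0, which holds the column titles because the file was read with header=None.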

You can insert a column into the dataframe. First make sure the new column has the same length as the dataframe to avoid a mismatch. Here, we insert random numbers scaled to lie between 0 and 40 as a column named "es".

In [9]:
es = 40*np.random.rand(len(names))
es.shape

(125,)

In [10]:
df.insert(2, "es", es, allow_duplicates=True)
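
The length requirement can be checked on a small example. This is a sketch under stated assumptions: the dataframe `toy`, the seed, and the column name "bad" are hypothetical, and `np.random.default_rng` is used in place of `np.random.rand` only to make the run reproducible.

```python
import numpy as np
import pandas as pd

# Hypothetical three-row dataframe.
toy = pd.DataFrame({"roll": ["X001", "X002", "X003"],
                    "name": ["ALPHA", "BETA", "GAMMA"]})

rng = np.random.default_rng(seed=0)
scores = 40 * rng.random(len(toy))   # one value per row, each in [0, 40)

# Position 2 places the new column after the two existing ones.
toy.insert(2, "es", scores)
print(list(toy.columns))

# An array of the wrong length is rejected with a ValueError.
try:
    toy.insert(0, "bad", np.zeros(len(toy) + 1))
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
print(mismatch_rejected)
```

Note that insert modifies the dataframe in place and returns None, so its result should not be assigned back to the variable.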

In [11]:
print(df)

            0                           1         es          2       3  \
0      RollNo                  First Name   6.008064  Last Name  Gender   
1    AE23B005             BHAVESH AGARWAL  35.244377        NaN       M   
2    AE23B010                    GUHAAN K  33.056267        NaN       M   
3    AE23B011                HEMANT MEENA  28.202288        NaN       M   
4    AE23B013              KISHOREKUMAR S  23.291066        NaN       M   
..        ...                         ...        ...        ...     ...   
120  MM23B050        MENON KARTHIK RAKESH   7.141023        NaN       M   
121  MM23B054  PAITHANKAR ATHARVA PRAKASH  28.391332        NaN       M   
122  MM23B058                    RAGHAV A  16.285036        NaN       M   
123  MM23B060         SAACHI PRAVIN AHIRE  22.139227        NaN       F   
124  MM23B062                  SASTIKAA S   4.365221        NaN       F   

                             4             5         6     7         8  \
0                Student 

You can save the dataframe to a CSV file in the notebook's current working directory. A relative path also works, provided you have write permission for the target folder.

Uncomment (i.e., remove the # character at the beginning of) the following line and try it in your notebook on your computer. The cell is commented out here so that the notebook renders on GitHub.

In [12]:
#df.to_csv("marks.csv", index=False)
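
To try the save step without touching the working directory, one option is to write to a temporary folder and read the file back. The sketch below assumes a made-up dataframe `toy`. The temporary folder is deleted when the with block ends.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical dataframe to save.
toy = pd.DataFrame({"roll": ["X001", "X002"], "es": [12.5, 30.0]})

with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "marks.csv"
    # index=False leaves the row labels out of the file.
    toy.to_csv(out_path, index=False)
    # Read the file back to confirm what was written.
    restored = pd.read_csv(out_path)

print(restored)
```

Without index=False, the row labels would be written as an extra unnamed first column and would come back as data when the file is read again.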

Prefix a command with ! to pass it to the shell for execution. Uncomment the following line and try it in your notebook on your computer. The cell is commented out here so that the notebook renders on GitHub.

In [13]:
#!ls -l marks.csv