### How to use the ocifs library to read a CSV file from Object Storage

This short example shows how to read a CSV file from **OCI Object Storage** using Oracle's **OCIFS** library:

* authentication uses a Resource Principal
* the result is returned as a Pandas DataFrame

In [1]:
import ocifs
import pandas as pd

Two options are shown:
* option 1: calling Pandas read_csv directly, with ocifs working behind the scenes
* option 2: calling ocifs directly

In [2]:
# functions definition (for option 2)

#
# reads a CSV file from Object Storage into a Pandas DataFrame
#
def read_from_object_storage(prefix, file_name):
    # get access to OSS as an fs
    
    # config={} assumes Resource Principal authorization;
    # the Resource Principal must already be configured
    fs = ocifs.OCIFileSystem(config={})
    
    FILE_PATH = prefix + file_name
    
    print('Reading using file_path:', FILE_PATH)
    
    # reading data from Object Storage
    with fs.open(FILE_PATH, 'rb') as f:
        #
        # This is only an example: in general you should set appropriate parameters.
        # These are pandas parameters and have nothing to do with ocifs.
        # The encoding here is ISO-8859-1, but your file could be UTF-8: set it as appropriate
        # (see the pandas documentation).
        #
        df = pd.read_csv(f, delimiter=',', encoding="ISO-8859-1")
    
    return df
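
The encoding comment in the function above can be illustrated without touching Object Storage. The sketch below uses only the standard library and made-up sample data (the `sample_text` content is hypothetical and is not taken from RACES.csv). It encodes a small CSV as ISO-8859-1, shows that decoding those bytes as UTF-8 fails, and then parses them correctly once the matching encoding is given.

```python
import csv
import io

# Hypothetical sample data containing a non-ASCII character (not from RACES.csv)
sample_text = "NAME,CIRCUITREF\nSão Paulo Grand Prix,interlagos\n"
raw_bytes = sample_text.encode("ISO-8859-1")

# Decoding with the wrong codec fails: the ISO-8859-1 byte for "ã" (0xE3)
# followed by a plain ASCII letter is not a valid UTF-8 sequence
try:
    raw_bytes.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decoding with the codec the data was written in recovers the text
with io.TextIOWrapper(io.BytesIO(raw_bytes), encoding="ISO-8859-1", newline="") as f:
    rows = list(csv.DictReader(f))

print("UTF-8 decoding succeeded:", utf8_ok)
print(rows[0]["NAME"])
```

Passing `encoding=` to `pd.read_csv` selects the codec in the same way, which is why the function sets ISO-8859-1 for this particular file.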

In [3]:
# the easiest way (see: https://github.com/oracle/ocifs)

# the encoding is set because the file is not UTF-8
# storage_options is {} because we're using Resource Principal
# otherwise you have to specify the path to the config file

races_df = pd.read_csv(
    "oci://data-best-practices@frqap2zhtzbe/RACES.csv",
    storage_options={},
    encoding="ISO-8859-1",
)

races_df.head()

Unnamed: 0,RACEID,YEAR,ROUND,NAME,F1DATE,TIME,URL,SCORE,DNF_COUNT,DNF_DUE_TO_ACCIDENT_COUNT,WEATHER,WEATHER_WET,CIRCUITREF,YEAR_C,RACE_COUNT,NAME_YEAR,OVERTAKEN_POSITIONS_TOTAL
0,25,2008,8,French Grand Prix,22-JUN-08,12:00:00,http://en.wikipedia.org/wiki/2008_French_Grand...,5.548,1,1,Dry at first; light rain in the final stages,Y,magny_cours,2008,1,2008 French Grand Prix,169
1,27,2008,10,German Grand Prix,20-JUL-08,12:00:00,http://en.wikipedia.org/wiki/2008_German_Grand...,7.18,3,1,"Cloudy, later sunny[1]",N,hockenheimring,2008,1,2008 German Grand Prix,181
2,23,2008,6,Monaco Grand Prix,25-MAY-08,12:00:00,http://en.wikipedia.org/wiki/2008_Monaco_Grand...,8.177,6,5,"Wet, drying later.",Y,monaco,2008,1,2008 Monaco Grand Prix,147
3,9,2009,9,German Grand Prix,12-JUL-09,12:00:00,http://en.wikipedia.org/wiki/2009_German_Grand...,7.096,2,0,Sunny and overcast,N,nurburgring,2009,1,2009 German Grand Prix,236
4,6,2009,6,Monaco Grand Prix,24-MAY-09,12:00:00,http://en.wikipedia.org/wiki/2009_Monaco_Grand...,5.504,6,5,Sunny,N,monaco,2009,1,2009 Monaco Grand Prix,92


In [4]:
# or, if you want more control over the code, use the function defined previously
# The format for the prefix is oci://<bucket_name>@<namespace_name>/ (note the trailing slash)

PREFIX = "oci://data-best-practices@frqap2zhtzbe/"

# put your file name here
FILE_NAME = "RACES.csv"

# see the function defined above
races_df = read_from_object_storage(prefix=PREFIX, file_name=FILE_NAME)

Reading using file_path: oci://data-best-practices@frqap2zhtzbe/RACES.csv
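
Because the function simply concatenates the prefix and the file name, a missing or doubled slash produces a path that points nowhere. As a minimal sketch, the two helpers below (`build_oci_uri` and `split_oci_uri` are hypothetical names for illustration, not part of ocifs) assemble and take apart a path in the `oci://<bucket_name>@<namespace_name>/<object_name>` format used in this notebook.

```python
def build_oci_uri(bucket, namespace, object_name):
    """Build an oci://<bucket>@<namespace>/<object> URI (hypothetical helper)."""
    if not bucket or not namespace or not object_name:
        raise ValueError("bucket, namespace and object_name must be non-empty")
    # Strip leading slashes so there is exactly one slash after the namespace
    return f"oci://{bucket}@{namespace}/{object_name.lstrip('/')}"


def split_oci_uri(uri):
    """Split an oci:// URI back into (bucket, namespace, object_name)."""
    scheme = "oci://"
    if not uri.startswith(scheme):
        raise ValueError(f"not an oci:// URI: {uri!r}")
    # Everything before the first slash is <bucket>@<namespace>
    location, _, object_name = uri[len(scheme):].partition("/")
    bucket, sep, namespace = location.partition("@")
    if not sep or not bucket or not namespace:
        raise ValueError(f"expected oci://<bucket>@<namespace>/...: {uri!r}")
    return bucket, namespace, object_name


uri = build_oci_uri("data-best-practices", "frqap2zhtzbe", "RACES.csv")
print(uri)
print(split_oci_uri(uri))
```

The resulting string can be passed to either option shown in this notebook in place of the hand-built path.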


In [5]:
races_df.head()

Unnamed: 0,RACEID,YEAR,ROUND,NAME,F1DATE,TIME,URL,SCORE,DNF_COUNT,DNF_DUE_TO_ACCIDENT_COUNT,WEATHER,WEATHER_WET,CIRCUITREF,YEAR_C,RACE_COUNT,NAME_YEAR,OVERTAKEN_POSITIONS_TOTAL
0,25,2008,8,French Grand Prix,22-JUN-08,12:00:00,http://en.wikipedia.org/wiki/2008_French_Grand...,5.548,1,1,Dry at first; light rain in the final stages,Y,magny_cours,2008,1,2008 French Grand Prix,169
1,27,2008,10,German Grand Prix,20-JUL-08,12:00:00,http://en.wikipedia.org/wiki/2008_German_Grand...,7.18,3,1,"Cloudy, later sunny[1]",N,hockenheimring,2008,1,2008 German Grand Prix,181
2,23,2008,6,Monaco Grand Prix,25-MAY-08,12:00:00,http://en.wikipedia.org/wiki/2008_Monaco_Grand...,8.177,6,5,"Wet, drying later.",Y,monaco,2008,1,2008 Monaco Grand Prix,147
3,9,2009,9,German Grand Prix,12-JUL-09,12:00:00,http://en.wikipedia.org/wiki/2009_German_Grand...,7.096,2,0,Sunny and overcast,N,nurburgring,2009,1,2009 German Grand Prix,236
4,6,2009,6,Monaco Grand Prix,24-MAY-09,12:00:00,http://en.wikipedia.org/wiki/2009_Monaco_Grand...,5.504,6,5,Sunny,N,monaco,2009,1,2009 Monaco Grand Prix,92
