Remember that in a Jupyter Notebook, the cells containing <br>
source code must be run in order, from top to bottom.

Otherwise, a cell may raise an error because it relies on a name that an earlier cell defines.

Pandas is a Python library used for working with data sets. It has <br>
functions for analyzing, cleaning, exploring, and manipulating data.

A DataFrame is designed to store two-dimensional, size-mutable, <br>
potentially heterogeneous tabular data.
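
As a rough illustration of those three properties, the sketch below builds a small DataFrame whose columns hold different data types and then adds a column to it. The variable name `ships`, the `cloaked` column, and all of the values are made up for this example.

```python
import pandas as pd

# Hypothetical data: each column holds a different type (heterogeneous).
ships = pd.DataFrame({
    'name': ['cobra', 'viper'],    # strings
    'max_speed': [1, 4],           # integers
    'cloaked': [True, False],      # booleans
})

# Two-dimensional: shape is (number of rows, number of columns).
print(ships.shape)

# Size-mutable: a new column can be added after the DataFrame is created.
ships['shield'] = [2.5, 5.0]       # floats
print(ships.shape)

# Each column keeps its own data type.
print(ships.dtypes)
```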

Documentation for pandas.DataFrame:

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html

In [2]:
import pandas as pd

In [3]:
'''
The first argument is a list of lists holding the data that
will be stored in the DataFrame: [[1, 2], [4, 5], [7, 8]]

The second argument (index) is a list of labels that will be
given to the rows in the DataFrame.

The third argument (columns) is a list of names that will be
given to the columns in the DataFrame.
'''
df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

The pandas.DataFrame.loc property is utilised to access a <br>
group of rows and columns by label(s) or a boolean array.

Display or output the "df" DataFrame:

In [4]:
df

,max_speed,shield
cobra,1,2
viper,4,5
sidewinder,7,8


Access the data in the "viper" row through the "viper" label.

In this context, the row is returned as a Series.

In [5]:
df.loc['viper']

max_speed    4
shield       5
Name: viper, dtype: int64

Passing a list of labels to the loc property will return a DataFrame:

In [6]:
df.loc[['viper', 'sidewinder']]

,max_speed,shield
viper,4,5
sidewinder,7,8
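
The difference between the two return types can be checked directly. The sketch below rebuilds the same DataFrame so that it runs on its own; the names `row_as_series` and `row_as_frame` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

row_as_series = df.loc['viper']      # single label -> Series
row_as_frame = df.loc[['viper']]     # list with one label -> DataFrame

print(type(row_as_series).__name__)  # Series
print(type(row_as_frame).__name__)   # DataFrame
print(row_as_frame.shape)            # (1, 2)
```

Even a list with one label returns a DataFrame, so the brackets decide the return type, not the number of rows.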


To obtain the value stored in a particular row and column, <br>
pass a single row label and a single column label. <br>
"cobra" is the row label and "shield" is the column name. The value stored
in that row and column is 2.

As its documentation states, the pandas.DataFrame.loc property accesses a group <br>
of rows and columns by label(s) or a boolean array.

Select the single value in row "cobra" and column "shield".

In [8]:
df.loc['cobra', 'shield']

np.int64(2)
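
The value comes back as a NumPy integer rather than a built-in Python int, which is why the output is displayed as np.int64(2) on recent NumPy versions. The sketch below shows one way to convert it, and also uses the `DataFrame.at` accessor, which is not part of this notebook and is shown only as an alternative for reading a single value.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

value = df.loc['cobra', 'shield']
print(type(value).__name__)      # a NumPy integer type

# Convert to a built-in Python int when a plain number is needed.
plain = int(value)
print(plain)                     # 2

# DataFrame.at reads a single value by row label and column label.
same = df.at['cobra', 'shield']
print(same == value)             # True
```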

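Slice with labels for the rows and use a single label for the column. <br>
Note that, unlike Python list or string slices, a label slice includes both its start and stop labels.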

Extract the values in the "max_speed" column for rows "cobra" to "sidewinder".

In [11]:
df.loc['cobra':'sidewinder', 'max_speed']

cobra         1
viper         4
sidewinder    7
Name: max_speed, dtype: int64
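
A label slice behaves differently from a position slice at its end point. The sketch below compares the two; `DataFrame.iloc`, the position-based counterpart of loc, does not appear elsewhere in this notebook and is used here only for contrast.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

# Label slice: both 'cobra' and 'viper' are included.
by_label = df.loc['cobra':'viper', 'max_speed']
print(by_label.index.tolist())     # ['cobra', 'viper']

# Position slice: the stop position 1 is excluded, as in a Python list.
by_position = df.iloc[0:1]
print(by_position.index.tolist())  # ['cobra']
```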

Extract all of the rows containing a "shield" value greater than six.

In [15]:
df.loc[df['shield'] > 6]

,max_speed,shield
sidewinder,7,8
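
It can help to look at the condition on its own. The sketch below stores it in a variable (the name `mask` is just a convention) to show that it is a Series of True and False values, one per row, and that loc keeps the rows marked True.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

# The comparison is evaluated for every row and returns a boolean Series.
mask = df['shield'] > 6
print(mask.tolist())               # [False, False, True]

# loc keeps only the rows where the mask is True.
selected = df.loc[mask]
print(selected.index.tolist())     # ['sidewinder']
```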


Extract all of the rows containing a "shield" value greater <br>
than six, but only output the "max_speed" column.

In [17]:
df.loc[df['shield'] > 6, ['max_speed']]

,max_speed
sidewinder,7
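
The column was passed as a list, ['max_speed'], which is why the result is a DataFrame. A minimal sketch of the difference, using illustrative variable names:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

# Column given as a list -> DataFrame with one column.
as_frame = df.loc[df['shield'] > 6, ['max_speed']]
print(type(as_frame).__name__)     # DataFrame

# Column given as a plain string -> Series.
as_series = df.loc[df['shield'] > 6, 'max_speed']
print(type(as_series).__name__)    # Series
```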


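Extract the rows whose values in the "max_speed" column are greater <br>
than 1 and whose values in the "shield" column are less than 8.

& means "and" in the condition; it is applied row by row.

Notice that each condition is wrapped in parentheses (). They are required <br>
because & is evaluated before comparison operators such as > and <.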

In [19]:
df.loc[(df['max_speed'] > 1) & (df['shield'] < 8)]

,max_speed,shield
viper,4,5
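
To see why the parentheses matter, the sketch below tries the same condition without them. Python then evaluates 1 & df['shield'] first and ends up asking whether a whole Series is true or false, which pandas refuses with a ValueError. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

error_message = None
try:
    # Missing parentheses: & is applied before > and <.
    df.loc[df['max_speed'] > 1 & df['shield'] < 8]
except ValueError as err:
    error_message = str(err)

print(error_message)

# With parentheses, each comparison is evaluated first, then combined.
correct = df.loc[(df['max_speed'] > 1) & (df['shield'] < 8)]
print(correct.index.tolist())      # ['viper']
```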


Extract the rows whose values in the "max_speed" column are greater <br>
than 4 or whose values in the "shield" column are less than 5.

| means "or" in the condition; it is also applied row by row.

In [21]:
df.loc[(df['max_speed'] > 4) | (df['shield'] < 5)]

,max_speed,shield
cobra,1,2
sidewinder,7,8
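
One way to check an "or" condition is to lay the two parts and their combination side by side. In the sketch below, the names `fast`, `weak`, and `either` are made up for the example.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

fast = df['max_speed'] > 4     # True only for sidewinder
weak = df['shield'] < 5        # True only for cobra
either = fast | weak           # True when at least one part is True

# Show the three boolean columns next to each other.
print(pd.DataFrame({'fast': fast, 'weak': weak, 'fast | weak': either}))

print(df.loc[either].index.tolist())   # ['cobra', 'sidewinder']
```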


In the "shield" column, set 50 as the value for <br>
the rows named "viper" and "sidewinder".

In [22]:
df.loc[['viper', 'sidewinder'], ['shield']] = 50

See the updated DataFrame.

In [23]:
df

,max_speed,shield
cobra,1,2
viper,4,50
sidewinder,7,50
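
A single value is copied to every selected cell. To give each row its own value, a list with one entry per selected row can be assigned instead. The sketch below works on a separate DataFrame named `demo` (a name chosen for this example) so that "df" in this notebook is left unchanged.

```python
import pandas as pd

demo = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                    index=['cobra', 'viper', 'sidewinder'],
                    columns=['max_speed', 'shield'])

# One value per selected row: 50 for "viper", 60 for "sidewinder".
demo.loc[['viper', 'sidewinder'], 'shield'] = [50, 60]

print(demo['shield'].tolist())     # [2, 50, 60]
```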


Set all columns of the row "cobra" to a value of 10.

In [28]:
df.loc['cobra'] = 10

In [29]:
df

,max_speed,shield
cobra,10,10
viper,4,50
sidewinder,7,50
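
A row can also receive a different value for each column by assigning a list with one entry per column. As before, the sketch uses a separate DataFrame named `demo` so that "df" is not modified.

```python
import pandas as pd

demo = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                    index=['cobra', 'viper', 'sidewinder'],
                    columns=['max_speed', 'shield'])

# One value per column: max_speed becomes 10, shield becomes 20.
demo.loc['cobra'] = [10, 20]

print(demo.loc['cobra'].tolist())  # [10, 20]
```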


Remember that a slicing expression selects a range of characters from a string.

The following is the format for slicing a string variable in Python: <br>
string[start : end]

[:] selects the entire range of characters from a string. The loc property <br>
accepts the same notation: a : on its own in the row position selects every row.

In [44]:
city_name = 'Zurich'

# city_name = city_name.upper()
city_name = city_name[:].upper()

city_name

'ZURICH'
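
A few more slices of the same string may make the start and end positions clearer. The sketch uses a new variable, `word`, so that `city_name` keeps its value.

```python
word = 'Zurich'

first_three = word[0:3]   # characters at positions 0, 1 and 2
from_three = word[3:]     # position 3 up to the end of the string
everything = word[:]      # no start and no end: the whole string

print(first_three)        # Zur
print(from_three)         # ich
print(everything)         # Zurich
```

The end position is not included, so word[0:3] stops before the character at position 3.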

Set all of the rows in the "max_speed" column to a value of 30.

In [34]:
df.loc[:, 'max_speed'] = 30

In [35]:
df

,max_speed,shield
cobra,30,10
viper,30,50
sidewinder,30,50
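
The : in the row position can also be given a start or an end label. The sketch below, which uses a separate DataFrame named `demo`, first counts the rows that a bare : selects and then assigns only from "viper" to the last row.

```python
import pandas as pd

demo = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                    index=['cobra', 'viper', 'sidewinder'],
                    columns=['max_speed', 'shield'])

# A bare : selects every row of the column.
n_selected = len(demo.loc[:, 'max_speed'])
print(n_selected)                      # 3

# An open-ended label slice: from "viper" to the last row.
demo.loc['viper':, 'max_speed'] = 30

print(demo['max_speed'].tolist())      # [1, 30, 30]
```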


Assign a value of 0 to every column of the rows whose <br>
"shield" value is greater than 35.

In [38]:
df.loc[df['shield'] > 35] = 0

In [39]:
df

,max_speed,shield
cobra,30,10
viper,0,0
sidewinder,0,0
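
Because no column was named, both columns of the matching rows were overwritten. To change only one column, add its name after the condition. The sketch below starts from the values the DataFrame had before the last assignment and stores them in a separate variable named `demo`.

```python
import pandas as pd

demo = pd.DataFrame([[30, 10], [30, 50], [30, 50]],
                    index=['cobra', 'viper', 'sidewinder'],
                    columns=['max_speed', 'shield'])

# Condition selects the rows; 'shield' limits the assignment to one column.
demo.loc[demo['shield'] > 35, 'shield'] = 0

print(demo['max_speed'].tolist())  # [30, 30, 30]
print(demo['shield'].tolist())     # [10, 0, 0]
```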
