<a href="https://colab.research.google.com/github/ShirsaM/My-Google-Colab/blob/main/Pandas_Exercise_2_Indexing%2C_Selecting_%26_Assigning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In Python, we can access a property of an object as an attribute.
Example:- `reviews.country`
where `reviews` is a DataFrame & `country` is a column in it.

> OR

We can access its values using the indexing ([]) operator.
Example:- `reviews['country']` 
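As a small sketch of the two access styles, the snippet below builds a tiny stand-in for `reviews` (the rows and the `region 1` column name are made up for illustration, not taken from the real dataset) and checks that both forms return the same Series. It also shows one case where only the `[]` form works: a column name containing a space.

```python
import pandas as pd

# Hypothetical stand-in for the real reviews DataFrame
reviews = pd.DataFrame({
    "country": ["Italy", "Portugal", "US"],
    "points": [87, 87, 90],
})

by_attribute = reviews.country     # attribute access
by_brackets = reviews["country"]   # indexing operator

# Both forms give the same Series
same = by_attribute.equals(by_brackets)

# A column name with a space cannot be written as an attribute,
# so only the [] form can reach it ("region 1" is an assumed name)
reviews["region 1"] = ["Etna", "Douro", "Oregon"]
region = reviews["region 1"]
```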

## Indexing in pandas

Pandas has its own accessor operators, `loc` and `iloc`.

There are 2 types of Selection:- 

### **1. Index-based selection**

Selecting data based on its numerical position in the data. `iloc` follows this paradigm.

In [None]:
# To select the first row of data in a DataFrame
reviews.iloc[0]

# To get a column with iloc
reviews.iloc[: , 0]

Both loc and iloc are row-first, column-second.

On its own, the : operator, which also comes from native Python, means "everything".

In [None]:
#  to select just the second and third entries of a column, we would do
reviews.iloc[1:3 , 0]

# It's also possible to pass a list:
reviews.iloc[[0,1,2] , 0]

# negative numbers can be used in selection. This will start counting forwards from the end of the values.
reviews.iloc[-5:]         #-------------------------> gives the last five elements of the dataset.


### **2. Label-based selection**
The second paradigm for attribute selection is the one followed by the loc operator: label-based selection. In this paradigm, it's the data index value, not its position, which matters.

In [None]:
# To get the country of the row labelled 0 (the first entry when the index is the default 0, 1, 2, ...):
reviews.loc[0 , 'country']

# The code below gives all rows, but only the 3 listed columns
reviews.loc[:, ['taster_name', 'taster_twitter_handle', 'points']]

**Note:-** `iloc` uses the Python stdlib indexing scheme, where the first element of the range is included and the last one excluded. So `0:10` will select entries 0,...,9. `loc`, meanwhile, indexes inclusively. So `0:10` will select entries 0,...,10.
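The difference is easiest to see when the index labels do not match the positions. The sketch below uses a made-up DataFrame named `toy` (not part of the original notebook) whose index starts at 10, so positions and labels cannot be confused.

```python
import pandas as pd

# Hypothetical data: index labels 10..14 sit at positions 0..4
toy = pd.DataFrame({"points": [87, 87, 90, 92, 85]},
                   index=[10, 11, 12, 13, 14])

# iloc slices by position and excludes the end: positions 0 and 1
by_position = toy.iloc[0:2]

# loc slices by label and includes the end: labels 10, 11 and 12
by_label = toy.loc[10:12]
```

Here `by_position` has two rows and `by_label` has three, even though both slices span a gap of two.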

## Manipulating the index
The `set_index()` method can be used to manipulate the index in any way we see fit. By default it returns a new DataFrame that uses the given column as its index.

Example:- reviews.set_index("title")
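A minimal sketch of what this looks like in practice, assuming a tiny stand-in for `reviews` with invented titles: after `set_index("title")`, rows can be looked up with `loc` by title, and the original DataFrame is left unchanged.

```python
import pandas as pd

# Hypothetical stand-in for the reviews DataFrame
reviews = pd.DataFrame({
    "title": ["Wine A", "Wine B"],
    "points": [87, 90],
})

# set_index returns a new DataFrame; reviews itself is not modified
by_title = reviews.set_index("title")

# The title is now the row label, so loc can look it up directly
points_b = by_title.loc["Wine B", "points"]
```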

## Conditional selection

In [None]:
# checking if each wine is Italian or not
reviews.country == 'Italy'

# This operation produced a Series of True/False booleans
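The boolean Series becomes useful once it is passed to `loc`, which keeps only the rows where the value is `True`. The sketch below uses a three-row stand-in for `reviews` (invented values) and also combines two conditions with `&`.

```python
import pandas as pd

# Hypothetical stand-in for the reviews DataFrame
reviews = pd.DataFrame({
    "country": ["Italy", "France", "Italy"],
    "points": [87, 90, 95],
})

# One True/False value per row
is_italian = reviews.country == "Italy"

# Keep only the rows where the mask is True
italian = reviews.loc[is_italian]

# Combine conditions with & (and) or | (or); each needs its own parentheses
high_italian = reviews.loc[is_italian & (reviews.points >= 90)]
```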

Pandas comes with a few built-in conditional selectors, two of which we will highlight here.

**The first is `isin`. `isin` lets you select data whose value "is in" a list of values.**

**The second is `isnull` (and its companion notnull). These methods let you highlight values which are (or are not) empty (NaN).**

In [None]:
# For example of isin , here's how we can use it to select wines only from Italy or France:
reviews.loc[reviews.country.isin (["Italy" , "France"])]

# For an example of notnull (the companion of isnull), to keep only the wines that have a price in the dataset, here's what we would do:
reviews.loc[reviews.price.notnull()]
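To see which rows each selector keeps, here is a small sketch on invented data, with one missing price. The values are assumptions made for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for the reviews DataFrame; one price is missing
reviews = pd.DataFrame({
    "country": ["Italy", "US", "France", "Spain"],
    "price": [15.0, float("nan"), 30.0, 12.0],
})

# Rows whose country is one of the listed values
italy_or_france = reviews.loc[reviews.country.isin(["Italy", "France"])]

# Rows that have a price
priced = reviews.loc[reviews.price.notnull()]

# Rows that lack a price
unpriced = reviews.loc[reviews.price.isnull()]
```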


## Assigning data
Assigning data to a DataFrame is easy. You can assign either a constant value:

Example:- reviews['critic'] = 'everyone'

Or with an iterable of values:

Example:- reviews['index_backwards'] = range(len(reviews), 0, -1)
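Both assignments can be tried on a small stand-in for `reviews` (three invented rows). The constant is broadcast to every row, while the iterable supplies one value per row.

```python
import pandas as pd

# Hypothetical stand-in for the reviews DataFrame
reviews = pd.DataFrame({"country": ["Italy", "US", "France"]})

# A constant is repeated for every row
reviews["critic"] = "everyone"

# An iterable supplies one value per row, here counting down from len(reviews)
reviews["index_backwards"] = range(len(reviews), 0, -1)
```

The iterable has to have the same length as the DataFrame, otherwise pandas raises a `ValueError`.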



---



In [2]:
import pandas as pd

In [None]:
# Select the description column from reviews and assign the result to the variable desc.
desc = reviews.loc[: , "description"]
# OR
desc = reviews.description
# OR
desc = reviews["description"]

In [None]:
# Select the first value from the description column of reviews, assigning it to variable first_description.
first_description = reviews.iloc[0,1]
# OR
first_description = reviews.description.iloc[0]

In [None]:
# Select the first row of data (the first record) from reviews, assigning it to the variable first_row.
first_row = reviews.iloc[0]

In [None]:
# Select the first 10 values from the description column in reviews, assigning the result to variable first_descriptions.
first_descriptions = reviews.iloc[:10 , 1]
# OR
first_descriptions = reviews.description.iloc[:10]


In [None]:
# Select the records with index labels 1, 2, 3, 5, and 8, assigning the result to the variable sample_reviews.
sample_reviews = reviews.loc[[1,2,3,5,8] , :]


In [None]:
# Create a variable df containing the country, province, region_1, and region_2 columns of the records with the index labels 0, 1, 10, and 100.
df = reviews.loc[[0,1,10,100] , ["country" , "province" , "region_1" , "region_2"]]


In [None]:
# Create a variable df containing the country and variety columns of the first 100 records.
df = reviews.loc[:99 , ["country" , "variety"]]


In [None]:
# Create a DataFrame italian_wines containing reviews of wines made in Italy. Hint: reviews.country equals what?
italian_wines = reviews[reviews.country == "Italy"]

In [None]:
# Create a DataFrame top_oceania_wines containing all reviews with at least 95 points (out of 100) for wines from Australia or New Zealand.
top_oceania_wines = reviews.loc[(reviews.country.isin(["Australia" , "New Zealand"])) & (reviews.points >= 95)]