# Exploring DataFrames with Currency Data

## Introduction
Welcome to this project on exploring currency data with pandas DataFrames! We will analyze a dataset that contains information about currencies from around the world.

Throughout this project, we will focus on practicing fundamental techniques on pandas DataFrames to extract meaningful information and perform various analyses. We will work with a dataset that includes currency names, symbols, currency codes, countries of usage, and other relevant details.

Our exploration will cover several essential topics. First, we will practice basic navigation using methods and attributes such as head(), tail(), info(), shape, describe(), nunique(), and isnull(). These let us quickly inspect the dataset, understand its structure, and identify any missing values.

Next, we will learn different methods for selecting specific columns from the DataFrame. Using indexing with [], loc[], and iloc[], we will extract and analyze specific attributes of the currencies. This selective approach will enable us to focus on the aspects that interest us the most and perform targeted analyses.

Lastly, we will use slicing techniques to perform basic filtering of the DataFrame. Slicing lets us extract subsets of rows and columns for more detailed analysis, such as examining specific groups of currencies or a range of rows by position.

Through this project, you will gain hands-on experience with pandas DataFrames and develop a solid foundation in data exploration, manipulation, and analysis. By working with a real-world currency dataset, you will enhance your skills in working with diverse datasets and be well-prepared for future data science projects.

Let's dive in and see what the currency data can tell us!

In [1]:
import pandas as pd
# Load the currency dataset and preview its first five rows
path_to_csv = "../../data/currencies.csv"
df = pd.read_csv(path_to_csv)
df.head()

,Name,Symbol,Code,Countries,Digits,Number
0,UK Pound,£,GBP,"Guernsey,Isle Of Man,Jersey,United Kingdom Of ...",2.0,826.0
1,Czech Koruna,Kč,CZK,Czechia,2.0,203.0
2,Latvian Lat,Ls,LVL,,,
3,Swiss Franc,CHF,CHF,"Liechtenstein,Switzerland",2.0,756.0
4,Croatian Kruna,kn,HRK,Croatia,2.0,191.0


## Activity

### 1. Getting an Overview of the dataset

Select all the correct answers.


- Fourth currency in the dataset is the Euro. ❌


- First currency in the dataset is the UK Pound. ✔


- Fourth currency in the dataset is the Swiss Franc. ✔


- First currency in the dataset is the US Dollar. ❌

### 2. Select the correct answers for dataset information


- Symbol column dtype is float64. ❌


- Symbol column dtype is object. ✔


- Digits column dtype is int64. ❌


- Digits column dtype is float64. ✔

In [2]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 34 entries, 0 to 33
Data columns (total 6 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   Name       34 non-null     object 
 1   Symbol     34 non-null     object 
 2   Code       34 non-null     object 
 3   Countries  32 non-null     object 
 4   Digits     32 non-null     float64
 5   Number     32 non-null     float64
dtypes: float64(2), object(4)
memory usage: 1.7+ KB
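Digits and Number hold whole numbers, yet info() reports them as float64. A likely explanation, given the two missing entries in each column, is that pandas stores missing numeric values as NaN, which is a float, so read_csv falls back to float64 for the whole column. The sketch below illustrates this with two hypothetical in-memory CSV snippets rather than the real currencies.csv:

```python
import io
import pandas as pd

# Hypothetical two-row CSV snippets, used here instead of the real currencies.csv.
csv_complete = "Name,Digits\nUK Pound,2\nCzech Koruna,2\n"
csv_with_gap = "Name,Digits\nUK Pound,2\nLatvian Lat,\n"

complete = pd.read_csv(io.StringIO(csv_complete))
with_gap = pd.read_csv(io.StringIO(csv_with_gap))

print(complete["Digits"].dtype)  # int64: every value is present
print(with_gap["Digits"].dtype)  # float64: the empty field becomes NaN
```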


### 3. Get the statistics of the dataset


- Mean of Digits column is 2.875. ❌


- Standard deviation of Digits column is 1.491869. ❌


- Mean of Digits column is 1.875000. ✔


- Standard deviation of Digits column is 0.491869. ✔

In [3]:
df['Digits'].describe()

count    32.000000
mean      1.875000
std       0.491869
min       0.000000
25%       2.000000
50%       2.000000
75%       2.000000
max       2.000000
Name: Digits, dtype: float64
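Note that count is 32 even though the DataFrame has 34 rows: describe() skips missing values, so the mean and standard deviation are computed over the non-null entries only. A small sketch with five hand-picked Digits values taken from rows shown in this notebook (not the full column) makes the effect visible:

```python
import numpy as np
import pandas as pd

# Hand-picked Digits values from rows shown in this notebook:
# UK Pound, Czech Koruna, Latvian Lat (missing), Swiss Franc, Korean Won.
digits = pd.Series([2.0, 2.0, np.nan, 2.0, 0.0], name="Digits")

stats = digits.describe()
print(len(digits))     # 5 rows in total
print(stats["count"])  # 4.0 non-missing values
print(stats["mean"])   # 1.5, i.e. 6 / 4 rather than 6 / 5
```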

### 4. Count the number of unique currencies in the dataset

In [4]:
df['Name'].nunique()

34

### 5. Identify the number of missing values in each column

Write the count of all missing values in the dataset.

In [7]:
df.isnull().sum()

Name         0
Symbol       0
Code         0
Countries    2
Digits       2
Number       2
dtype: int64

- The isnull() method returns a boolean DataFrame that marks each missing value as True. Calling sum() on it counts the True values in each column.
- The Countries, Digits, and Number columns have 2 missing values each, so the dataset contains 6 missing values in total.
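The total can also be computed directly by summing the per-column counts a second time. Here is a minimal sketch on a three-row sample built from rows shown earlier (the variable names sample, per_column, and total_missing are illustrative, not part of the original notebook):

```python
import numpy as np
import pandas as pd

# Three rows copied from the notebook output; Latvian Lat has three missing fields.
sample = pd.DataFrame({
    "Name": ["Czech Koruna", "Latvian Lat", "Croatian Kruna"],
    "Countries": ["Czechia", None, "Croatia"],
    "Digits": [2.0, np.nan, 2.0],
    "Number": [203.0, np.nan, 191.0],
})

per_column = sample.isnull().sum()     # one count per column
total_missing = int(per_column.sum())  # grand total across all columns
print(total_missing)  # 3
```

Applied to the full DataFrame, the same pattern would be df.isnull().sum().sum().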

### 6. Determine the highest currency number in the dataset

Write the answer as a float. For example, if the answer is 98, write 98.0.

In [9]:
df['Number'].max()

986.0
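max() returns only the value. To see which currency it belongs to, one option is idxmax(), which returns the index label of the largest value and skips NaN by default. The sketch below uses a four-row sample assembled from rows shown in this notebook, so the label it prints differs from the one in the full dataset:

```python
import numpy as np
import pandas as pd

# Four rows copied from the notebook output (not the full dataset).
sample = pd.DataFrame({
    "Name": ["UK Pound", "Czech Koruna", "Latvian Lat", "Brazilian Real"],
    "Number": [826.0, 203.0, np.nan, 986.0],
})

top_label = sample["Number"].idxmax()     # index label of the largest value
top_name = sample.loc[top_label, "Name"]  # look up the matching currency name
print(top_label, top_name)  # 3 Brazilian Real
```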

### 7. Select Currency Names

Extract the 'Name' column from the DataFrame df and assign it to a new variable names.

In [10]:
names = df['Name']

### 8. Get the details of the 3rd row

Extract the 3rd row from the DataFrame df and assign it to a new variable row_3.

In [11]:
row_3 = df.iloc[2]
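Selecting a single position with iloc returns a Series whose index is the column names. Passing a list of positions instead keeps the result as a one-row DataFrame. A short sketch on a sample of rows from this notebook shows both forms (row_as_series and row_as_frame are illustrative names):

```python
import pandas as pd

# Four rows and three columns copied from the notebook output.
sample = pd.DataFrame({
    "Name": ["UK Pound", "Czech Koruna", "Latvian Lat", "Swiss Franc"],
    "Symbol": ["£", "Kč", "Ls", "CHF"],
    "Code": ["GBP", "CZK", "LVL", "CHF"],
})

row_as_series = sample.iloc[2]   # Series indexed by column name
row_as_frame = sample.iloc[[2]]  # one-row DataFrame
print(type(row_as_series).__name__, row_as_series["Name"])  # Series Latvian Lat
print(row_as_frame.shape)  # (1, 3)
```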

### 9. Select rows 10 to 15 (inclusive) from the DataFrame

Select rows 10 to 15 (inclusive) from the DataFrame df and assign it to a new variable rows.

> Note: the task counts rows from 1, not by index label, so rows 10 to 15 sit at positions 9 to 14.

In [13]:
rows = df.iloc[9:15]
rows

,Name,Symbol,Code,Countries,Digits,Number
9,Hungarian Forint,Ft,HUF,Hungary,2.0,348.0
10,Brazilian Real,R$,BRL,Brazil,2.0,986.0
11,Lithuanian Litas,Lt,LTL,,,
12,Bulgarian Lev,лB,BGN,Bulgaria,2.0,975.0
13,Polish Zloty,zł,PLN,Poland,2.0,985.0
14,US Dollar,$,USD,"American Samoa,Bonaire, Sint Eustatius And Sab...",2.0,840.0
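The slice 9:15 works because iloc excludes the end position. With the default 0-based index, the same rows can be selected by label using loc, which includes the end label. The sketch below assumes a hypothetical 20-row frame with placeholder names, since only the slicing behavior matters here:

```python
import pandas as pd

# Hypothetical 20-row frame with placeholder names and the default 0..19 index.
sample = pd.DataFrame({"Name": [f"currency_{i}" for i in range(20)]})

by_position = sample.iloc[9:15]  # positions 9..14, end excluded
by_label = sample.loc[9:14]      # labels 9..14, end included
print(by_position.equals(by_label))  # True
print(len(by_position))              # 6
```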


### 10. Extract Alternating Rows from DataFrame

Extract the alternating rows from the DataFrame df, where "alternating" means the 1st, 3rd, 5th rows, and so on, counted by position rather than by index label. Store the selected rows in a new variable named rows_every_other.

In [15]:
rows_every_other = df.iloc[::2]
rows_every_other

,Name,Symbol,Code,Countries,Digits,Number
0,UK Pound,£,GBP,"Guernsey,Isle Of Man,Jersey,United Kingdom Of ...",2.0,826.0
2,Latvian Lat,Ls,LVL,,,
4,Croatian Kruna,kn,HRK,Croatia,2.0,191.0
6,Korean Won,₩,KRW,Korea (The Republic Of),0.0,410.0
8,Turkish Lira,₤,TRY,Turkey,2.0,949.0
10,Brazilian Real,R$,BRL,Brazil,2.0,986.0
12,Bulgarian Lev,лB,BGN,Bulgaria,2.0,975.0
14,US Dollar,$,USD,"American Samoa,Bonaire, Sint Eustatius And Sab...",2.0,840.0
16,Japanese Yen,¥,JPY,Japan,0.0,392.0
18,Norwegian Krone,kr,NOK,"Bouvet Island,Norway,Svalbard And Jan Mayen",2.0,578.0
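Because iloc works on positions, a step of 2 picks every other row even when the index labels are out of order. The sketch below sorts a five-row sample by name to scramble the labels, then takes the odd and even rows by position (by_name, odd_rows, and even_rows are illustrative names):

```python
import pandas as pd

# Five currency names copied from the notebook output.
sample = pd.DataFrame({
    "Name": ["UK Pound", "Czech Koruna", "Latvian Lat", "Swiss Franc", "Croatian Kruna"],
})

# Sorting by name reorders the rows, so the index labels are no longer 0, 1, 2, ...
by_name = sample.sort_values("Name")

odd_rows = by_name.iloc[::2]    # 1st, 3rd, 5th rows by position
even_rows = by_name.iloc[1::2]  # 2nd, 4th rows by position
print(odd_rows["Name"].tolist())  # ['Croatian Kruna', 'Latvian Lat', 'UK Pound']
print(odd_rows.index.tolist())    # [4, 2, 0]
```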


### 11. Select columns with indices 2, 4, and 5 from the DataFrame

Select columns with indices 2, 4, and 5 from the DataFrame df and assign it to a new variable cols.

In [16]:
cols = df.iloc[:,[2, 4, 5]]
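According to the df.info() output above, positions 2, 4, and 5 correspond to the Code, Digits, and Number columns, so the same selection can be written with column names. A minimal sketch on a three-row sample with the same column order:

```python
import numpy as np
import pandas as pd

# Three rows copied from the notebook output, with the same six columns as df.
sample = pd.DataFrame({
    "Name": ["Czech Koruna", "Latvian Lat", "Croatian Kruna"],
    "Symbol": ["Kč", "Ls", "kn"],
    "Code": ["CZK", "LVL", "HRK"],
    "Countries": ["Czechia", None, "Croatia"],
    "Digits": [2.0, np.nan, 2.0],
    "Number": [203.0, np.nan, 191.0],
})

by_position = sample.iloc[:, [2, 4, 5]]         # select by column position
by_name = sample[["Code", "Digits", "Number"]]  # select by column name
print(by_position.columns.tolist())  # ['Code', 'Digits', 'Number']
print(by_position.equals(by_name))   # True
```

Selecting by name is usually more robust if the column order might change, while selecting by position is convenient when names are long or unknown.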

### 12. Select the first three columns of the dataframe

Select the first three columns of the dataframe df and assign it to a new variable cols_first_three.

> Here the index counts as the first of the three columns, so only the Name and Symbol data columns are selected.

The expected output looks like this (it contains more rows than below):

,Name,Symbol
0,UK Pound,£
1,Czech Koruna,Kč
2,Latvian Lat,Ls
3,Swiss Franc,CHF
4,Croatian Kruna,kn
5,Danish Krone,kr

In [18]:
cols_first_three = df.iloc[:,:2]
cols_first_three

,Name,Symbol
0,UK Pound,£
1,Czech Koruna,Kč
2,Latvian Lat,Ls
3,Swiss Franc,CHF
4,Croatian Kruna,kn
5,Danish Krone,kr
6,Korean Won,₩
7,Swedish Krona,kr
8,Turkish Lira,₤
9,Hungarian Forint,Ft
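The index is displayed next to the data but is not itself a column, which is why iloc[:, :2] is enough here. If the index were needed as a real column, one option is reset_index(), which moves it into a column named index. A sketch on a three-row sample (two_data_columns and with_index_column are illustrative names):

```python
import pandas as pd

# Three rows and three columns copied from the notebook output.
sample = pd.DataFrame({
    "Name": ["UK Pound", "Czech Koruna", "Latvian Lat"],
    "Symbol": ["£", "Kč", "Ls"],
    "Code": ["GBP", "CZK", "LVL"],
})

two_data_columns = sample.iloc[:, :2]                 # Name, Symbol; index shown alongside
with_index_column = sample.reset_index().iloc[:, :3]  # index, Name, Symbol as real columns
print(two_data_columns.columns.tolist())   # ['Name', 'Symbol']
print(with_index_column.columns.tolist())  # ['index', 'Name', 'Symbol']
```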
