
# Dictionaries in Python

## Objectives

After completing this lab you will be able to:

*  Create a Dictionary and perform operations on the Dictionary


## Table of Contents
<div class="alert alert-block alert-info" style="margin-top: 20px">
  
1. [What are Dictionaries?](#1)
    
    1.1 [Create a Dictionary and access the elements](#1.1)
</div>

<hr>


<a id="dic"></a>
## Dictionaries


<a id=1></a>
## What are Dictionaries?


A dictionary consists of keys and values. It is helpful to compare a dictionary to a list: where a list is indexed by integer position, a dictionary is indexed by keys, and each key is used to look up its associated value.


<img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%202/images/DictsList.png" width="400">


### Dictionary Methods:

1. **dict.keys()** - Returns a view object of all the keys in the dictionary.

2. **dict.values()** - Returns a view object of all the values in the dictionary.

3. **dict.items()** - Returns a view object of the key-value pairs (as tuples) in the dictionary.

4. **dict.get(key)** - Returns the value associated with the given key. If the key is not found, it returns `None`, or the default value when one is passed as `dict.get(key, default)`.

5. **dict.pop(key)** - Removes the key and its associated value from the dictionary and returns the value. Raises `KeyError` if the key is missing and no default is given.

6. **dict.popitem()** - Removes and returns the most recently inserted key-value pair (Python 3.7 and later; in earlier versions the pair was arbitrary).

7. **dict.update(other_dict)** - Updates the dictionary with key-value pairs from another dictionary or an iterable of pairs.

8. **dict.clear()** - Removes all items from the dictionary.

9. **dict.copy()** - Returns a shallow copy of the dictionary.

10. **dict.fromkeys(keys, value)** - Creates a new dictionary with the specified keys and a common value.

<hr>
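
The methods listed above can be tried on a small dictionary. The following is a minimal sketch; the `inventory` dictionary and its values are made up for illustration and are not part of the lab data.

```python
# Hypothetical example data, used only to illustrate the methods above.
inventory = {"apple": 3, "banana": 5, "cherry": 2}

# get() returns None (or a supplied default) instead of raising KeyError.
missing = inventory.get("mango")
fallback = inventory.get("mango", 0)

# pop() removes a key and returns its value.
removed = inventory.pop("banana")

# update() adds new keys and overwrites existing ones.
inventory.update({"cherry": 10, "date": 7})

# popitem() removes the most recently inserted pair (Python 3.7+).
last_pair = inventory.popitem()

# copy() is shallow: the new dict is independent at the top level.
snapshot = inventory.copy()
inventory.clear()

# fromkeys() builds a dict with the same value for every key.
counts = dict.fromkeys(["a", "b", "c"], 0)

print(missing, fallback, removed, last_pair)
print(snapshot, inventory, counts)
```

Using `get()` with a default is often preferable to indexing with `[]` when a missing key is an expected case rather than an error.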

### Common Dictionary Functions:

1. **len(dictionary)** - Returns the number of key-value pairs in the dictionary.

2. **key in dictionary** - Checks if a key exists in the dictionary.

3. **key not in dictionary** - Checks if a key does not exist in the dictionary.

4. **sorted(dictionary)** - Returns a sorted list of keys in the dictionary.

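5. **max(dictionary)** - Returns the largest key in the dictionary. To get the key with the largest value instead, use `max(dictionary, key=dictionary.get)`.

6. **min(dictionary)** - Returns the smallest key in the dictionary. To get the key with the smallest value instead, use `min(dictionary, key=dictionary.get)`.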
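
A short sketch of these built-ins may make the key-versus-value distinction clearer. The `prices` dictionary is a made-up example, not data from this lab.

```python
# Hypothetical example data, used only for illustration.
prices = {"pear": 4, "apple": 9, "kiwi": 1}

size = len(prices)                  # number of key-value pairs
has_apple = "apple" in prices       # membership tests look at keys
no_mango = "mango" not in prices

sorted_keys = sorted(prices)        # sorted list of the keys

# max() and min() compare the keys themselves (alphabetically for strings).
largest_key = max(prices)
smallest_key = min(prices)

# To rank keys by their values, pass the value lookup as the key function.
priciest = max(prices, key=prices.get)
cheapest = min(prices, key=prices.get)

print(size, has_apple, no_mango, sorted_keys)
print(largest_key, smallest_key, priciest, cheapest)
```

Here `max(prices)` gives `'pear'` because it is last alphabetically, while `max(prices, key=prices.get)` gives `'apple'` because it has the highest value.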

<a id=1.1></a>
### Create a Dictionary and access the elements


In [11]:
sample_dict = {'apple': 3, 'banana': 5, 'cherry': 2}

In [12]:
sample_dict

{'apple': 3, 'banana': 5, 'cherry': 2}

In [13]:
numbers = [1, 2, 3, 4]
numbers[2]

3

In [14]:
sample_dict['apple']

3

In [15]:
sample_dict['banana']

5

In [16]:
sample_dict['cherry']

2

In [17]:
my_dict = {
    "name": "John",
    "age": 30,
    "city": "New York"
}

In [18]:
my_dict

{'name': 'John', 'age': 30, 'city': 'New York'}

In [19]:
my_dict.keys()

dict_keys(['name', 'age', 'city'])

In [20]:
my_dict.values()

dict_values(['John', 30, 'New York'])

In [21]:
print("Name:", my_dict["name"])
print("Age:", my_dict["age"])
print("City:", my_dict["city"])

Name: John
Age: 30
City: New York


In [24]:
my_dict.items()

dict_items([('name', 'John'), ('age', 30), ('city', 'New York')])

In [25]:
my_dict2 = {
    "name": ["John", "Michel", "Jeck", "Sucy"],
    "gender": ["M", "M", "F", "F"],
    "age": [30, 23, 32, 43],
    "city": ["New York", "pp", "sv", "tk"]
}

In [26]:
my_dict2

{'name': ['John', 'Michel', 'Jeck', 'Sucy'],
 'gender': ['M', 'M', 'F', 'F'],
 'age': [30, 23, 32, 43],
 'city': ['New York', 'pp', 'sv', 'tk']}

In [27]:
my_dict2.keys()

dict_keys(['name', 'gender', 'age', 'city'])

In [28]:
my_dict2.values()

dict_values([['John', 'Michel', 'Jeck', 'Sucy'], ['M', 'M', 'F', 'F'], [30, 23, 32, 43], ['New York', 'pp', 'sv', 'tk']])

In [29]:
my_dict2.items()

dict_items([('name', ['John', 'Michel', 'Jeck', 'Sucy']), ('gender', ['M', 'M', 'F', 'F']), ('age', [30, 23, 32, 43]), ('city', ['New York', 'pp', 'sv', 'tk'])])

In [49]:
for key, value in my_dict2.items():
    print(f'My keys: {key} my value: {value}.')

My keys: name my value: ['John', 'Michel', 'Jeck', 'Sucy'].
My keys: gender my value: ['M', 'M', 'F', 'F'].
My keys: age my value: [30, 23, 32, 43].
My keys: city my value: ['New York', 'pp', 'sv', 'tk'].


In [50]:
for i in my_dict2:
    print(f'My key is: {i}')

My key is: name
My key is: gender
My key is: age
My key is: city


In [30]:
import pandas as pd

In [31]:
df = pd.DataFrame(my_dict2)
df

Unnamed: 0,name,gender,age,city
0,John,M,30,New York
1,Michel,M,23,pp
2,Jeck,F,32,sv
3,Sucy,F,43,tk


In [33]:
# Save the DataFrame to a CSV file, leaving out the index column
df.to_csv('students', index=False)

In [34]:
df.to_excel('Student.xlsx', index=False)

In [36]:
# Load the dataset back from the CSV file saved above
df1 = pd.read_csv('students')
df1

Unnamed: 0,name,gender,age,city
0,John,M,30,New York
1,Michel,M,23,pp
2,Jeck,F,32,sv
3,Sucy,F,43,tk


In [36]:
df2 = pd.read_excel('Student.xlsx')
df2

Unnamed: 0,name,gender,age,city
0,John,M,30,New York
1,Michel,M,23,pp
2,Jeck,F,32,sv
3,Sucy,F,43,tk


In [39]:
df2.age

0    30
1    23
2    32
3    43
Name: age, dtype: int64

In [40]:
df2['age']

0    30
1    23
2    32
3    43
Name: age, dtype: int64

In [41]:
df2.city

0    New York
1          pp
2          sv
3          tk
Name: city, dtype: object

In [40]:
df1.name

0      John
1    Michel
2      Jeck
3      Sucy
Name: name, dtype: object

In [39]:
df1['name']

0      John
1    Michel
2      Jeck
3      Sucy
Name: name, dtype: object

In [38]:
df1.describe()

Unnamed: 0,age
count,4.0
mean,32.0
std,8.286535
min,23.0
25%,28.25
50%,31.0
75%,34.75
max,43.0


In [37]:
df1.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   name    4 non-null      object
 1   gender  4 non-null      object
 2   age     4 non-null      int64 
 3   city    4 non-null      object
dtypes: int64(1), object(3)
memory usage: 256.0+ bytes


In [43]:
df2.dtypes

name      object
gender    object
age        int64
city      object
dtype: object

In [42]:
df2.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   name    4 non-null      object
 1   gender  4 non-null      object
 2   age     4 non-null      int64 
 3   city    4 non-null      object
dtypes: int64(1), object(3)
memory usage: 256.0+ bytes


### Exercise

In [None]:
{}
[]

<img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%202/images/DictsStructure.png" width="400">


In [10]:
df = pd.read_csv('DataTest.csv')
df

Unnamed: 0,IP Address,Language,Purchase Price
0,149.146.147.205,el,98.14
1,15.160.41.51,fr,70.73
2,132.207.160.22,de,0.95
3,30.250.74.19,es,78.04
4,24.140.33.94,es,77.82
...,...,...,...
9995,29.73.197.114,it,82.21
9996,121.133.168.51,pt,25.63
9997,156.210.0.254,el,83.98
9998,55.78.26.143,es,38.84


1. What is the minimum purchase price?
2. What is the median purchase price?
3. What is the maximum purchase price?
4. What is the mean (average) purchase price?
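
One way to approach these questions is with the summary methods on a pandas Series. The sketch below uses a five-row stand-in built from the first five purchase prices shown above rather than the full file, so its numbers are not the answers for the whole dataset. The names `sample`, `prices`, and `summary` are illustrative.

```python
import pandas as pd

# Stand-in data: the first five purchase prices shown above, not the full file.
sample = pd.DataFrame({"Purchase Price": [98.14, 70.73, 0.95, 78.04, 77.82]})
prices = sample["Purchase Price"]

# Collect the four requested statistics in a dictionary.
summary = {
    "min": prices.min(),
    "median": prices.median(),
    "max": prices.max(),
    "mean": prices.mean(),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The same four calls should work unchanged on the `'Purchase Price'` column of the full DataFrame once it is loaded.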

In [44]:
url = 'https://raw.githubusercontent.com/ManonYa09/Data_analysis_with_pandas/main/Data/Ecommerce%20Purchases.csv'
# Read the CSV file into a DataFrame
df = pd.read_csv(url)


In [46]:
df.head()

Unnamed: 0,Address,Lot,AM or PM,Browser Info,Company,Credit Card,CC Exp Date,CC Security Code,CC Provider,Email,Job,IP Address,Language,Purchase Price
0,"16629 Pace Camp Apt. 448\nAlexisborough, NE 77...",46 in,PM,Opera/9.56.(X11; Linux x86_64; sl-SI) Presto/2...,Martinez-Herman,6011929061123406,02/20,900,JCB 16 digit,pdunlap@yahoo.com,"Scientist, product/process development",149.146.147.205,el,98.14
1,"9374 Jasmine Spurs Suite 508\nSouth John, TN 8...",28 rn,PM,Opera/8.93.(Windows 98; Win 9x 4.90; en-US) Pr...,"Fletcher, Richards and Whitaker",3337758169645356,11/18,561,Mastercard,anthony41@reed.com,Drilling engineer,15.160.41.51,fr,70.73
2,Unit 0065 Box 5052\nDPO AP 27450,94 vE,PM,Mozilla/5.0 (compatible; MSIE 9.0; Windows NT ...,"Simpson, Williams and Pham",675957666125,08/19,699,JCB 16 digit,amymiller@morales-harrison.com,Customer service manager,132.207.160.22,de,0.95
3,"7780 Julia Fords\nNew Stacy, WA 45798",36 vm,PM,Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_0 ...,"Williams, Marshall and Buchanan",6011578504430710,02/24,384,Discover,brent16@olson-robinson.info,Drilling engineer,30.250.74.19,es,78.04
4,"23012 Munoz Drive Suite 337\nNew Cynthia, TX 5...",20 IE,AM,Opera/9.58.(X11; Linux x86_64; it-IT) Presto/2...,"Brown, Watson and Andrews",6011456623207998,10/25,678,Diners Club / Carte Blanche,christopherwright@gmail.com,Fine artist,24.140.33.94,es,77.82


In [47]:
df.tail()

Unnamed: 0,Address,Lot,AM or PM,Browser Info,Company,Credit Card,CC Exp Date,CC Security Code,CC Provider,Email,Job,IP Address,Language,Purchase Price
9995,"966 Castaneda Locks\nWest Juliafurt, CO 96415",92 XI,PM,Mozilla/5.0 (Windows NT 5.1) AppleWebKit/5352 ...,Randall-Sloan,342945015358701,03/22,838,JCB 15 digit,iscott@wade-garner.com,Printmaker,29.73.197.114,it,82.21
9996,"832 Curtis Dam Suite 785\nNorth Edwardburgh, T...",41 JY,AM,Mozilla/5.0 (compatible; MSIE 9.0; Windows NT ...,"Hale, Collins and Wilson",210033169205009,07/25,207,JCB 16 digit,mary85@hotmail.com,Energy engineer,121.133.168.51,pt,25.63
9997,Unit 4434 Box 6343\nDPO AE 28026-0283,74 Zh,AM,Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_7...,Anderson Ltd,6011539787356311,05/21,1,VISA 16 digit,tyler16@gmail.com,Veterinary surgeon,156.210.0.254,el,83.98
9998,"0096 English Rest\nRoystad, IA 12457",74 cL,PM,Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_8;...,Cook Inc,180003348082930,11/17,987,American Express,elizabethmoore@reid.net,Local government officer,55.78.26.143,es,38.84
9999,"40674 Barrett Stravenue\nGrimesville, WI 79682",64 Hr,AM,Mozilla/5.0 (X11; Linux i686; rv:1.9.5.20) Gec...,Greene Inc,4139972901927273,02/19,302,JCB 15 digit,rachelford@vaughn.com,"Embryologist, clinical",176.119.198.199,el,67.59


In [49]:
df['AM or PM'].value_counts()

PM    5068
AM    4932
Name: AM or PM, dtype: int64

In [51]:
mean_price = df['Purchase Price'].mean()

In [52]:
print('The mean purchase price is', mean_price)

The mean purchase price is 50.347302


In [53]:
df['Purchase Price'].mode()

0    49.73
Name: Purchase Price, dtype: float64

In [54]:
df[df['Purchase Price'] == 49.73]

Unnamed: 0,Address,Lot,AM or PM,Browser Info,Company,Credit Card,CC Exp Date,CC Security Code,CC Provider,Email,Job,IP Address,Language,Purchase Price
2704,"30172 Garcia Mill Apt. 679\nColemanbury, AS 22...",45 jq,AM,Mozilla/5.0 (compatible; MSIE 9.0; Windows NT ...,"Livingston, Thompson and Butler",4938154128312364,12/16,833,VISA 16 digit,mathewhudson@hotmail.com,Broadcast journalist,25.90.28.11,pt,49.73
2711,USNV Howard\nFPO AP 62057-9944,47 BF,PM,Mozilla/5.0 (iPod; U; CPU iPhone OS 4_1 like M...,Roberts Inc,30578966011370,12/22,7789,Maestro,hutchinsonbrenda@boyd.org,Materials engineer,205.151.250.111,en,49.73
3191,"1891 Reed Pass Apt. 710\nMichaelfurt, RI 41093...",58 Ba,AM,Mozilla/5.0 (Windows NT 5.1; en-US; rv:1.9.1.2...,Norris and Sons,3528752447068803,03/25,4547,Voyager,phyllishenderson@moore.org,"Copywriter, advertising",6.60.103.123,de,49.73
3810,"7016 Richard Center Apt. 216\nNew Paulburgh, C...",53 WU,AM,Mozilla/5.0 (Macintosh; PPC Mac OS X 10_8_4; r...,"Kim, Harris and Lee",3528700667851234,11/18,435,American Express,balljessica@yahoo.com,"Engineer, communications",33.68.160.144,zh,49.73
6264,"64939 Patton Track\nMartinezfort, NE 03634",25 VM,AM,Mozilla/5.0 (Macintosh; PPC Mac OS X 10_7_6) A...,"Murphy, Berry and Chambers",5281095092449452,02/26,943,Diners Club / Carte Blanche,sarawhite@stephenson-stephens.biz,Forest/woodland manager,111.83.146.153,el,49.73
8970,"48922 Bates Haven Suite 013\nWallacetown, MD 3...",69 Or,PM,Mozilla/5.0 (Windows NT 6.1; en-US; rv:1.9.0.2...,Valencia-Gomez,5343665582475153,10/25,261,VISA 16 digit,jessica77@bass.net,Translator,207.215.141.148,es,49.73
9056,"112 Smith Cliff Suite 009\nMclaughlinmouth, OK...",64 Wv,AM,Opera/9.45.(Windows 98; Win 9x 4.90; it-IT) Pr...,"Fowler, Howell and Stephens",6011914526251489,04/17,682,Mastercard,shawnpena@hotmail.com,Facilities manager,26.250.120.163,zh,49.73


In [56]:
df.isnull().sum()

Address             0
Lot                 0
AM or PM            0
Browser Info        0
Company             0
Credit Card         0
CC Exp Date         0
CC Security Code    0
CC Provider         0
Email               0
Job                 0
IP Address          0
Language            0
Purchase Price      0
dtype: int64

In [57]:
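# dtypes is an attribute, not a method, so it is accessed without parentheses
df.dtypes

Address              object
Lot                  object
AM or PM             object
Browser Info         object
Company              object
Credit Card           int64
CC Exp Date          object
CC Security Code      int64
CC Provider          object
Email                object
Job                  object
IP Address           object
Language             object
Purchase Price      float64
dtype: object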

In [58]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 14 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   Address           10000 non-null  object 
 1   Lot               10000 non-null  object 
 2   AM or PM          10000 non-null  object 
 3   Browser Info      10000 non-null  object 
 4   Company           10000 non-null  object 
 5   Credit Card       10000 non-null  int64  
 6   CC Exp Date       10000 non-null  object 
 7   CC Security Code  10000 non-null  int64  
 8   CC Provider       10000 non-null  object 
 9   Email             10000 non-null  object 
 10  Job               10000 non-null  object 
 11  IP Address        10000 non-null  object 
 12  Language          10000 non-null  object 
 13  Purchase Price    10000 non-null  float64
dtypes: float64(1), int64(2), object(11)
memory usage: 1.1+ MB


In [3]:
df.to_csv('Dataecommerce.csv', index=False)

In [4]:
# df1 = pd.read_csv('Dataecommerce.csv')

In [5]:
# df1.head()

Unnamed: 0,Address,Lot,AM or PM,Browser Info,Company,Credit Card,CC Exp Date,CC Security Code,CC Provider,Email,Job,IP Address,Language,Purchase Price
0,"16629 Pace Camp Apt. 448\nAlexisborough, NE 77...",46 in,PM,Opera/9.56.(X11; Linux x86_64; sl-SI) Presto/2...,Martinez-Herman,6011929061123406,02/20,900,JCB 16 digit,pdunlap@yahoo.com,"Scientist, product/process development",149.146.147.205,el,98.14
1,"9374 Jasmine Spurs Suite 508\nSouth John, TN 8...",28 rn,PM,Opera/8.93.(Windows 98; Win 9x 4.90; en-US) Pr...,"Fletcher, Richards and Whitaker",3337758169645356,11/18,561,Mastercard,anthony41@reed.com,Drilling engineer,15.160.41.51,fr,70.73
2,Unit 0065 Box 5052\nDPO AP 27450,94 vE,PM,Mozilla/5.0 (compatible; MSIE 9.0; Windows NT ...,"Simpson, Williams and Pham",675957666125,08/19,699,JCB 16 digit,amymiller@morales-harrison.com,Customer service manager,132.207.160.22,de,0.95
3,"7780 Julia Fords\nNew Stacy, WA 45798",36 vm,PM,Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_0 ...,"Williams, Marshall and Buchanan",6011578504430710,02/24,384,Discover,brent16@olson-robinson.info,Drilling engineer,30.250.74.19,es,78.04
4,"23012 Munoz Drive Suite 337\nNew Cynthia, TX 5...",20 IE,AM,Opera/9.58.(X11; Linux x86_64; it-IT) Presto/2...,"Brown, Watson and Andrews",6011456623207998,10/25,678,Diners Club / Carte Blanche,christopherwright@gmail.com,Fine artist,24.140.33.94,es,77.82


In [6]:
# df1.columns

Index(['Address', 'Lot', 'AM or PM', 'Browser Info', 'Company', 'Credit Card',
       'CC Exp Date', 'CC Security Code', 'CC Provider', 'Email', 'Job',
       'IP Address', 'Language', 'Purchase Price'],
      dtype='object')

In [7]:
# df = df1[['IP Address', 'Language', 'Purchase Price']]

In [9]:
# df.to_csv('DataTest.csv', index=False)