# Loading Data in pandas
> In this chapter, you'll learn a powerful Python library: pandas. pandas lets you read, modify, and search tabular datasets (like spreadsheets and database tables). You'll examine credit card records for the suspects and see if any of them made suspicious purchases. This is a summary of the lecture "Introduction to Data Science in Python", via DataCamp.
- toc: true 
- badges: true
- comments: true
- author: Chanseok Kang
- categories: [Python, Datacamp, Data_Science]
- image: 

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

## What is pandas?
- Load tabular data from different sources
- Search for particular rows or columns
- Calculate aggregate statistics
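
As a rough illustration of the three tasks above, here is a minimal sketch. The small in-memory DataFrame `records` is a hypothetical stand-in for data loaded from a file, and the variable names are chosen only for this example.

```python
import pandas as pd

# Hypothetical in-memory data standing in for a real file
records = pd.DataFrame({
    "suspect": ["Kirstine Smith", "Gertrude Cox", "Fred Frequentist"],
    "item": ["broccoli", "fizzy drink", "broccoli"],
    "price": [1.25, 1.90, 1.25],
})

# Search for particular rows: purchases of broccoli
broccoli = records[records["item"] == "broccoli"]

# Search for a particular column
prices = records["price"]

# Calculate an aggregate statistic: total amount spent
total_spent = prices.sum()

print(broccoli)
print(total_spent)
```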

### Loading a DataFrame
We're still working hard to solve the kidnapping of Bayes, the Golden Retriever. Previously, we used a license plate spotted at the crime scene to narrow the list of suspects to:

- Fred Frequentist
- Ronald Aylmer Fisher
- Gertrude Cox
- Kirstine Smith

We've obtained credit card records for all four suspects. Perhaps some of them made suspicious purchases before the kidnapping?

In [4]:
# Load the CSV "credit_records.csv"
credit_records = pd.read_csv('./dataset/credit_records.csv')

# Display the first five rows of credit_records using the .head() method
credit_records.head(5)

| | suspect | location | date | item | price |
|---|---|---|---|---|---|
| 0 | Kirstine Smith | Groceries R Us | January 6, 2018 | broccoli | 1.25 |
| 1 | Gertrude Cox | Petroleum Plaza | January 6, 2018 | fizzy drink | 1.90 |
| 2 | Fred Frequentist | Groceries R Us | January 6, 2018 | broccoli | 1.25 |
| 3 | Gertrude Cox | Groceries R Us | January 12, 2018 | broccoli | 1.25 |
| 4 | Kirstine Smith | Clothing Club | January 9, 2018 | shirt | 14.25 |
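
If you don't have `credit_records.csv` at hand, the same loading step can be tried on a few rows held in a string. This is only a sketch: `csv_text` and `sample` are names made up for the example, and the rows are copied from the output above.

```python
import io
import pandas as pd

# A few rows copied from the output above, used as a stand-in for the CSV file
csv_text = """suspect,location,date,item,price
Kirstine Smith,Groceries R Us,"January 6, 2018",broccoli,1.25
Gertrude Cox,Petroleum Plaza,"January 6, 2018",fizzy drink,1.9
Fred Frequentist,Groceries R Us,"January 6, 2018",broccoli,1.25
"""

# pd.read_csv accepts a file path or any file-like object
sample = pd.read_csv(io.StringIO(csv_text))

# .head(n) returns the first n rows; n defaults to 5
print(sample.head(2))
```

Note that the quoted date field contains a comma; the quotes keep it in a single column.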


### Inspecting a DataFrame
We've loaded the credit card records of our four suspects into a DataFrame called `credit_records`. Let's learn more about the structure of this DataFrame.

In [7]:
# Use .info() to inspect the DataFrame credit_records
# .info() prints its summary directly and returns None, so no print() is needed
credit_records.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 104 entries, 0 to 103
Data columns (total 5 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   suspect   104 non-null    object 
 1   location  104 non-null    object 
 2   date      104 non-null    object 
 3   item      104 non-null    object 
 4   price     104 non-null    float64
dtypes: float64(1), object(4)
memory usage: 4.2+ KB
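
Besides `.info()`, a few attributes expose the same structural facts one at a time. The sketch below uses a hypothetical two-row DataFrame `df` in place of `credit_records`.

```python
import pandas as pd

# Hypothetical two-row stand-in for credit_records
df = pd.DataFrame({
    "suspect": ["Kirstine Smith", "Gertrude Cox"],
    "item": ["broccoli", "fizzy drink"],
    "price": [1.25, 1.90],
})

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column names
print(df.dtypes)         # data type of each column
```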


## Selecting columns


### Two methods for selecting columns
Once again, we've loaded the credit card records of our four suspects into a DataFrame called `credit_records`. Let's examine the items that they've purchased.



In [8]:
# Select the column item from credit_records
# Use brackets and string notation
items = credit_records['item']

# Display the results
print(items)

0         broccoli
1      fizzy drink
2         broccoli
3         broccoli
4            shirt
          ...     
99           shirt
100          pants
101          dress
102         burger
103      cucumbers
Name: item, Length: 104, dtype: object


In [9]:
# Select the column item from credit_records
# Use dot notation
items = credit_records.item

# Display the results
print(items)

0         broccoli
1      fizzy drink
2         broccoli
3         broccoli
4            shirt
          ...     
99           shirt
100          pants
101          dress
102         burger
103      cucumbers
Name: item, Length: 104, dtype: object
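
The two notations are interchangeable for a simple name like `item`, but dot notation has limits. The sketch below uses hypothetical column names (`unit price`, `count`) that are not in `credit_records`, chosen to show cases where only brackets work.

```python
import pandas as pd

# Hypothetical columns chosen to show where dot notation breaks down
df = pd.DataFrame({
    "item": ["broccoli", "shirt"],
    "unit price": [1.25, 14.25],   # name contains a space
    "count": [2, 1],               # name collides with DataFrame.count()
})

# Both notations return the same Series for a simple column name
same = df["item"].equals(df.item)

# Brackets work for any column name
unit_prices = df["unit price"]
counts = df["count"]

# df.count is the DataFrame method, not the "count" column
is_method = callable(df.count)

print(same, is_method)
```

Bracket notation is therefore the safer default when column names come from an external file.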


## Selecting rows with logic


### Logical testing
Let's practice writing logical statements and displaying the output.

Recall that we use the following operators:

- `==` tests that two values are equal.
- `!=` tests that two values are not equal.
- `>` and `<` test whether one value is greater than or less than another, respectively.
- `>=` and `<=` test whether one value is greater than or equal to, or less than or equal to, another, respectively.
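
The operators above can be tried on plain values and on a pandas Series. This is a minimal sketch; the Series `prices` holds a few prices taken from the earlier output, and the threshold of 10 is arbitrary.

```python
import pandas as pd

# Comparisons on plain values evaluate to True or False
print(3 < 4)
print("broccoli" == "broccoli")
print(1.25 != 1.9)

# On a pandas Series the comparison is applied element by element
prices = pd.Series([1.25, 1.9, 14.25])
is_expensive = prices >= 10
print(is_expensive.tolist())

# The resulting True/False values can be used to keep only matching rows
expensive = prices[is_expensive]
print(expensive.tolist())
```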