# Data Manipulation with Pandas and NumPy
Data manipulation is the process of changing, organizing or transforming data to make it more useful, readable or suitable for analysis. It involves tasks like cleaning, filtering, sorting, grouping or calculating new values from existing data.

Tools like Pandas and NumPy, two popular Python libraries, are often used to streamline these tasks. Pandas excels at handling structured data, such as tables of crypto trades, and makes filtering and grouping straightforward, while NumPy supports fast numerical computations, such as calculating average prices or returns.
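
As a rough illustration of the kind of numerical work NumPy handles, the sketch below computes an average price and simple period-over-period returns. The price values are made up for the example and do not come from real market data.

```python
import numpy as np

# Hypothetical closing prices for one coin (illustrative values only)
prices = np.array([100.0, 110.0, 99.0, 108.9])

# Average price over the whole period
average_price = prices.mean()

# Simple returns: (current price - previous price) / previous price
daily_returns = np.diff(prices) / prices[:-1]

print(average_price)
print(daily_returns)
```

Both calculations run on the whole array at once, which is why no explicit loop is needed.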

For instance, a crypto investor might manipulate data by filtering trades to show only Bitcoin transactions over $5,000, sorting them by timestamp to track price movements, and aggregating daily totals to assess trading volume. This helps uncover trends, optimize strategies and ensure data accuracy, ultimately supporting better decisions in the fast-paced crypto market.
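
The filter, sort and aggregate workflow described above could look roughly like this in Pandas. The `trades` table, its column names (`coin`, `amount_usd`, `timestamp`) and all values are hypothetical, chosen only to make the example self-contained.

```python
import pandas as pd

# Hypothetical trade log (all values are made up for illustration)
trades = pd.DataFrame({
    'coin': ['Bitcoin', 'Ethereum', 'Bitcoin', 'Bitcoin', 'Bitcoin'],
    'amount_usd': [7500.0, 9000.0, 1200.0, 6000.0, 8000.0],
    'timestamp': pd.to_datetime([
        '2024-01-02 14:30', '2024-01-01 09:00', '2024-01-01 11:15',
        '2024-01-01 16:45', '2024-01-02 08:10',
    ]),
})

# 1. Filter: Bitcoin trades worth more than $5,000
large_btc = trades[(trades['coin'] == 'Bitcoin') & (trades['amount_usd'] > 5000)]

# 2. Sort: oldest trade first
large_btc = large_btc.sort_values('timestamp')

# 3. Aggregate: total traded amount per calendar day
daily_totals = large_btc.groupby(large_btc['timestamp'].dt.date)['amount_usd'].sum()

print(large_btc)
print(daily_totals)
```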

### Pandas
Pandas is a popular Python library used for working with data. It helps you load, clean, analyze and manipulate data easily. Think of it like an Excel spreadsheet in Python! It's great for handling tables of data.
### Key Concepts
Series: A single column of data (like a list with labels)
DataFrame: A table with rows and columns, where each column is a Series
Index: Labels for rows, helping you identify and access data
You can load data from files (CSV, Excel, etc.), manipulate it and save it back
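
To make the Series and Index ideas concrete, here is a small sketch. The variable name `prices` and the choice of coin names as labels are assumptions made for this example.

```python
import pandas as pd

# A Series is one labelled column; the Index holds the labels
prices = pd.Series(
    [45000, 3000, 0.85],
    index=['Bitcoin', 'Ethereum', 'Ripple'],
    name='Price',
)

print(prices.index.tolist())  # the row labels
print(prices['Ethereum'])     # look up a value by its label
```

Using meaningful labels instead of the default 0, 1, 2, ... lets you access a value by name rather than by position.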

### Installing / Importing Pandas

In [2]:
# Install first if needed: pip install pandas
import pandas as pd


### Create a DataFrame
A DataFrame is like a table with rows and columns. Let's create one with sample cryptocurrency data (e.g., coin names, prices and trade volumes).

In [9]:
# Create a Dictionary with Crypto Data

data = {
    'Coin': ['Bitcoin', 'Ethereum', 'Ripple', 'Litecoin'],
    'Price': [45000, 3000, 0.85, 120],
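    # Note: "nan" below is a plain string, not a real missing value,
    # so pandas stores the Volume column with dtype object
    'Volume': [15000, "nan", 200000, 50000]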
}

df = pd.DataFrame(data)

print(df)

       Coin     Price  Volume
0   Bitcoin  45000.00   15000
1  Ethereum   3000.00     nan
2    Ripple      0.85  200000
3  Litecoin    120.00   50000


#### Explore Data


In [10]:
df

       Coin     Price  Volume
0   Bitcoin  45000.00   15000
1  Ethereum   3000.00     nan
2    Ripple      0.85  200000
3  Litecoin    120.00   50000


In [5]:
print(df.head(2))

       Coin    Price Volume
0   Bitcoin  45000.0  15000
1  Ethereum   3000.0    nan


In [6]:
print(df.tail(2))

       Coin   Price  Volume
2    Ripple    0.85  200000
3  Litecoin  120.00   50000


In [11]:
df.info()  # info() prints the summary itself, so no print() is needed

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   Coin    4 non-null      object 
 1   Price   4 non-null      float64
 2   Volume  4 non-null      object 
dtypes: float64(1), object(2)
memory usage: 224.0+ bytes
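
Notice that `info()` reports Volume as `object` with 4 non-null values: the `"nan"` entry is a string, so Pandas does not treat it as missing. One possible way to fix this is `pd.to_numeric` with `errors='coerce'`. The sketch below works on a separate copy named `df_clean` (a name chosen for this example) so the original `df` stays unchanged.

```python
import pandas as pd

data = {
    'Coin': ['Bitcoin', 'Ethereum', 'Ripple', 'Litecoin'],
    'Price': [45000, 3000, 0.85, 120],
    'Volume': [15000, "nan", 200000, 50000],
}
df_clean = pd.DataFrame(data)

# errors='coerce' turns anything that cannot be parsed as a number into NaN
df_clean['Volume'] = pd.to_numeric(df_clean['Volume'], errors='coerce')

print(df_clean['Volume'].dtype)         # numeric dtype instead of object
print(df_clean['Volume'].isna().sum())  # number of missing values
```

After the conversion, numeric operations on Volume (sums, means, comparisons) behave as expected and the missing entry is skipped by default.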


### Basic Data Manipulation

In [13]:
# Select multiple columns by passing a list of column names

print(df[['Coin', 'Price']])

       Coin     Price
0   Bitcoin  45000.00
1  Ethereum   3000.00
2    Ripple      0.85
3  Litecoin    120.00


In [14]:
# Select a single column (the result is a Series)
print(df['Coin'])

0     Bitcoin
1    Ethereum
2      Ripple
3    Litecoin
Name: Coin, dtype: object
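
Besides selecting columns with square brackets, rows and individual cells can be selected with `loc` (by label) and `iloc` (by position). This is a minimal sketch on a small stand-in table called `df_demo`, a name used only in this example.

```python
import pandas as pd

df_demo = pd.DataFrame({
    'Coin': ['Bitcoin', 'Ethereum', 'Ripple', 'Litecoin'],
    'Price': [45000, 3000, 0.85, 120],
})

# loc selects by label: row label 1, column 'Coin'
coin_at_1 = df_demo.loc[1, 'Coin']

# iloc selects by integer position: the first row, returned as a Series
first_row = df_demo.iloc[0]

# loc also accepts label slices (inclusive at both ends) and a list of columns
middle = df_demo.loc[1:2, ['Coin', 'Price']]

print(coin_at_1)
print(first_row)
print(middle)
```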


### Filtering Rows

In [15]:
# Keep only the rows where Price is greater than 100
print(df[df['Price'] > 100])

       Coin    Price Volume
0   Bitcoin  45000.0  15000
1  Ethereum   3000.0    nan
3  Litecoin    120.0  50000
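
Filters can also be combined. The sketch below shows `&` (and), `|` (or) and `isin` on a stand-in table named `df_demo`; the thresholds are arbitrary values picked for illustration.

```python
import pandas as pd

df_demo = pd.DataFrame({
    'Coin': ['Bitcoin', 'Ethereum', 'Ripple', 'Litecoin'],
    'Price': [45000, 3000, 0.85, 120],
})

# AND: price between 100 and 10,000 (each condition needs its own parentheses)
mid_priced = df_demo[(df_demo['Price'] > 100) & (df_demo['Price'] < 10000)]

# OR: very cheap or very expensive coins
extremes = df_demo[(df_demo['Price'] < 1) | (df_demo['Price'] > 10000)]

# isin: keep rows whose Coin appears in a given list
selected = df_demo[df_demo['Coin'].isin(['Bitcoin', 'Ripple'])]

print(mid_priced)
print(extremes)
print(selected)
```

The parentheses matter because `&` and `|` bind more tightly than comparison operators in Python.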


### Sort Data
Sort by a column, such as Price, in ascending or descending order.

In [None]:
print(df.sort_values('Price'))  # Ascending Order

       Coin     Price  Volume
2    Ripple      0.85  200000
3  Litecoin    120.00   50000
1  Ethereum   3000.00     nan
0   Bitcoin  45000.00   15000


In [None]:
print(df.sort_values('Price', ascending=False))  # Sort in Descending Order

       Coin     Price  Volume
0   Bitcoin  45000.00   15000
1  Ethereum   3000.00     nan
3  Litecoin    120.00   50000
2    Ripple      0.85  200000
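
Sorting is often followed by taking the top few rows. Here is a small sketch of two ways to get the two most expensive coins, again using a stand-in table named `df_demo`.

```python
import pandas as pd

df_demo = pd.DataFrame({
    'Coin': ['Bitcoin', 'Ethereum', 'Ripple', 'Litecoin'],
    'Price': [45000, 3000, 0.85, 120],
})

# Sort descending, keep the first two rows, and renumber the index from 0
top_two = (
    df_demo.sort_values('Price', ascending=False)
    .head(2)
    .reset_index(drop=True)
)

# nlargest does the sort-and-take step in one call (original index is kept)
largest_two = df_demo.nlargest(2, 'Price')

print(top_two)
print(largest_two)
```

Note that `sort_values` returns a new DataFrame and leaves the original untouched unless you assign the result back.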


### Add New Column

In [20]:
# New column: each price increased by 2%
df['Price_after_2Pct'] = df['Price'] * 1.02
print(df)

       Coin     Price  Volume  Price_after_2Pct
0   Bitcoin  45000.00   15000         45900.000
1  Ethereum   3000.00     nan          3060.000
2    Ripple      0.85  200000             0.867
3  Litecoin    120.00   50000           122.400
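
New columns can also be built from several existing columns or from a condition. The sketch below assumes Volume is stored as a numeric column with a real missing value; the column names `Traded_Value` and `Tier` and the 1000 threshold are hypothetical choices for this example.

```python
import numpy as np
import pandas as pd

df_demo = pd.DataFrame({
    'Coin': ['Bitcoin', 'Ethereum', 'Ripple', 'Litecoin'],
    'Price': [45000, 3000, 0.85, 120],
    'Volume': [15000, np.nan, 200000, 50000],  # np.nan marks a real missing value
})

# Combine two columns element by element; a missing Volume gives a missing result
df_demo['Traded_Value'] = df_demo['Price'] * df_demo['Volume']

# Label each row based on a condition
df_demo['Tier'] = np.where(df_demo['Price'] >= 1000, 'high', 'low')

print(df_demo)
```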


In [None]:
# Summary statistics for the numeric columns
df.describe()

              Price  Price_after_2Pct
count      4.000000          4.000000
mean   12030.212500      12270.816750
std    22023.550649      22464.021662
min        0.850000          0.867000
25%       90.212500         92.016750
50%     1560.000000       1591.200000
75%    13500.000000      13770.000000
max    45000.000000      45900.000000


In [30]:
# Save the DataFrame to a CSV file; index=False leaves out the row labels
df.to_csv('crypto_data.csv', index=False)
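
To check that a saved file can be loaded again, you can read it back with `pd.read_csv`. This sketch writes to a temporary folder so it does not touch `crypto_data.csv`; the file name `crypto_data_demo.csv` and the table `df_demo` are assumptions made for the example.

```python
import os
import tempfile

import pandas as pd

df_demo = pd.DataFrame({
    'Coin': ['Bitcoin', 'Ethereum', 'Ripple', 'Litecoin'],
    'Price': [45000, 3000, 0.85, 120],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'crypto_data_demo.csv')

    # Save without the index, then load the file back into a new DataFrame
    df_demo.to_csv(path, index=False)
    df_loaded = pd.read_csv(path)

print(df_loaded)
print(df_loaded.dtypes)
```

Because the file was written with `index=False`, the loaded table has only the Coin and Price columns and gets a fresh default index.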