## Task 1 — Books Data Analysis

You have book data with the following fields:
**Title**, **Author**, **Publication Year**, **Price**.

### Objectives:
- Create a DataFrame with **at least six rows**.
- Display the **entire DataFrame**.
- Calculate the **average price** of the books.
- Find the books **published after 2015**.
- Sort the DataFrame by **price in ascending order**.


In [2]:
import pandas as pd

books = pd.DataFrame({
    "Title": ["Clean Code", "Deep Learning", "Python Crash Course", "The Pragmatic Programmer", "Fluent Python", "Hands-On ML"],
    "Author": ["Robert C. Martin", "Ian Goodfellow", "Eric Matthes", "Andrew Hunt", "Luciano Ramalho", "Aurélien Géron"],
    "Publication Year": [2008, 2016, 2019, 1999, 2015, 2017],
    "Price": [28.50, 45.00, 32.00, 30.00, 40.00, 50.00]
})

print("Full DataFrame:")
print(books)

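# Mean of the Price column
avg_price = books["Price"].mean()
print("Average book price:", avg_price)

# Boolean mask keeps rows published strictly after 2015 (2015 itself is excluded)
after_2015 = books[books["Publication Year"] > 2015]
print("\nBooks published after 2015:")
print(after_2015)

# sort_values returns a new DataFrame; books itself is left unchanged
sorted_by_price = books.sort_values(by="Price", ascending=True)
print("\nSorted by price (ascending):")
print(sorted_by_price)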


Full DataFrame:
                      Title            Author  Publication Year  Price
0                Clean Code  Robert C. Martin              2008   28.5
1             Deep Learning    Ian Goodfellow              2016   45.0
2       Python Crash Course      Eric Matthes              2019   32.0
3  The Pragmatic Programmer       Andrew Hunt              1999   30.0
4             Fluent Python   Luciano Ramalho              2015   40.0
5               Hands-On ML    Aurélien Géron              2017   50.0
Average book price: 37.583333333333336

Books published after 2015:
                 Title          Author  Publication Year  Price
1        Deep Learning  Ian Goodfellow              2016   45.0
2  Python Crash Course    Eric Matthes              2019   32.0
5          Hands-On ML  Aurélien Géron              2017   50.0

Sorted by price (ascending):
                      Title            Author  Publication Year  Price
0                Clean Code  Robert C. Martin              200
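The average above prints with full floating-point precision. As a minimal sketch that reuses the same book data (the name `recent_by_price` is hypothetical and not part of the original solution), the value can be rounded for display, and the filter and sort steps can be chained into one expression:

```python
import pandas as pd

# Same book data as in the task, limited to the columns needed here
books = pd.DataFrame({
    "Title": ["Clean Code", "Deep Learning", "Python Crash Course",
              "The Pragmatic Programmer", "Fluent Python", "Hands-On ML"],
    "Publication Year": [2008, 2016, 2019, 1999, 2015, 2017],
    "Price": [28.50, 45.00, 32.00, 30.00, 40.00, 50.00],
})

# Format the mean to two decimals for display only; avg_price keeps full precision
avg_price = books["Price"].mean()
print(f"Average book price: {avg_price:.2f}")

# Filter to books published after 2015, sort by price, and renumber the rows
recent_by_price = (
    books[books["Publication Year"] > 2015]
    .sort_values(by="Price")
    .reset_index(drop=True)
)
print(recent_by_price)
```

Calling `reset_index(drop=True)` is optional; it only replaces the original row labels with a fresh 0-based index.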

## Task 2 — Orders Data Analysis

A CSV file contains order data with the following fields:
**Order Number** (stored as the `OrderNumber` column in the file), **Client**, **Date**, **Amount**.

### Objectives:
- Read the data from the CSV file into a DataFrame.
- Display the **first ten rows** of the DataFrame.
- Determine the **number of orders for each client**.
- Find the **maximum and minimum order amounts**.
- Calculate the **total amount of all orders**.

In [3]:
import pandas as pd

orders = pd.read_csv("orders.csv")

print("First 10 rows of the DataFrame:")
print(orders.head(10))

# value_counts tallies the rows for each client, most frequent first
orders_per_client = orders["Client"].value_counts()
print("\nNumber of orders per client:")
print(orders_per_client)

# Largest and smallest single order amounts
max_amount = orders["Amount"].max()
min_amount = orders["Amount"].min()

print("\nMaximum order amount:", max_amount)
print("Minimum order amount:", min_amount)

# Sum of the Amount column across all orders
total_amount = orders["Amount"].sum()
print("\nTotal amount of all orders:", total_amount)


First 10 rows of the DataFrame:
   OrderNumber   Client        Date  Amount
0         1001    Alice  2024-01-10     250
1         1002      Bob  2024-01-11     180
2         1003    Alice  2024-01-12     320
3         1004  Charlie  2024-01-13     150
4         1005      Bob  2024-01-14     400
5         1006    Alice  2024-01-15     210
6         1007  Charlie  2024-01-16     500
7         1008      Bob  2024-01-17     130
8         1009    Alice  2024-01-18     275
9         1010  Charlie  2024-01-19     350

Number of orders per client:
Client
Alice      5
Bob        3
Charlie    3
Name: count, dtype: int64

Maximum order amount: 500
Minimum order amount: 130

Total amount of all orders: 2955
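The file `orders.csv` itself is not shown here. As a hedged sketch for trying the same steps without it, the ten rows displayed above can be loaded from an in-memory string. This is an assumption-laden stand-in: the per-client counts above add up to eleven orders, so the full file has at least one row that is not displayed, and the results below differ from the output above. The names `csv_text` and `summary` are hypothetical.

```python
import io
import pandas as pd

# Only the ten rows shown in the output above; the real file contains more
csv_text = """OrderNumber,Client,Date,Amount
1001,Alice,2024-01-10,250
1002,Bob,2024-01-11,180
1003,Alice,2024-01-12,320
1004,Charlie,2024-01-13,150
1005,Bob,2024-01-14,400
1006,Alice,2024-01-15,210
1007,Charlie,2024-01-16,500
1008,Bob,2024-01-17,130
1009,Alice,2024-01-18,275
1010,Charlie,2024-01-19,350
"""

# parse_dates converts the Date column to datetime64 instead of plain strings
orders = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

# Orders per client for this ten-row sample
counts = orders["Client"].value_counts()
print(counts)

# agg computes several statistics of the Amount column in one call
summary = orders["Amount"].agg(["max", "min", "sum"])
print(summary)
```

Parsing the dates up front is a design choice: it makes later steps such as filtering by month possible, at no cost to the counts and totals.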


## Task 3 — Food Products Data Analysis

You have a table with food product data containing the following fields:
**Product**, **Category**, **Calories**, **Protein**.

### Objectives:
- Create a DataFrame with **at least ten rows**.
- Display the **entire DataFrame**.
- Find all products with **calorie content greater than 300**.
- Calculate the **average protein content by category**.
- Sort the DataFrame by **calories in descending order**.


In [4]:
import pandas as pd

food = pd.DataFrame({
    "Product": [
        "Pizza", "Burger", "Salad", "Pasta", "Steak",
        "Soup", "Cake", "Fish", "Rice", "Chicken"
    ],
    "Category": [
        "Fast Food", "Fast Food", "Healthy", "Fast Food", "Meat",
        "Healthy", "Dessert", "Seafood", "Healthy", "Meat"
    ],
    "Calories": [400, 350, 150, 320, 500, 180, 450, 300, 280, 330],
    "Protein": [12, 15, 5, 10, 40, 8, 6, 22, 7, 35]
})

print("Full DataFrame:")
print(food)

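# Strict comparison: only products above 300 calories are kept
high_calorie = food[food["Calories"] > 300]
print("\nProducts with calories greater than 300:")
print(high_calorie)

# Group rows by category, then average Protein within each group
avg_protein = food.groupby("Category")["Protein"].mean()
print("\nAverage protein content by category:")
print(avg_protein)

# ascending=False puts the highest-calorie products first
sorted_food = food.sort_values(by="Calories", ascending=False)
print("\nSorted by calories (descending):")
print(sorted_food)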


Full DataFrame:
   Product   Category  Calories  Protein
0    Pizza  Fast Food       400       12
1   Burger  Fast Food       350       15
2    Salad    Healthy       150        5
3    Pasta  Fast Food       320       10
4    Steak       Meat       500       40
5     Soup    Healthy       180        8
6     Cake    Dessert       450        6
7     Fish    Seafood       300       22
8     Rice    Healthy       280        7
9  Chicken       Meat       330       35

Products with calories greater than 300:
   Product   Category  Calories  Protein
0    Pizza  Fast Food       400       12
1   Burger  Fast Food       350       15
3    Pasta  Fast Food       320       10
4    Steak       Meat       500       40
6     Cake    Dessert       450        6
9  Chicken       Meat       330       35

Average protein content by category:
Category
Dessert       6.000000
Fast Food    12.333333
Healthy       6.666667
Meat         37.500000
Seafood      22.000000
Name: Protein, dtype: float64

Sorted by c
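Fish has exactly 300 calories, so it does not appear in the filtered output above. The following minimal sketch, using the same product data, contrasts the strict and inclusive comparisons and shows `nlargest` as one possible shortcut when only the top few rows are wanted. The names `strict`, `inclusive`, and `top3` are hypothetical.

```python
import pandas as pd

# Same products and calorie values as in the task
food = pd.DataFrame({
    "Product": ["Pizza", "Burger", "Salad", "Pasta", "Steak",
                "Soup", "Cake", "Fish", "Rice", "Chicken"],
    "Calories": [400, 350, 150, 320, 500, 180, 450, 300, 280, 330],
})

# Strict comparison excludes a product with exactly 300 calories
strict = food[food["Calories"] > 300]

# Inclusive comparison keeps it
inclusive = food[food["Calories"] >= 300]

# nlargest returns the n rows with the highest values, already sorted descending
top3 = food.nlargest(3, "Calories")

print(len(strict), len(inclusive))
print(top3)
```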

## Task 4 — Employees and Projects Analysis

Employee project data is collected with the following fields:
**Name**, **Project**, **Hours**.

### Objectives:
- Create a DataFrame with **at least eight rows**.
- Display the **original DataFrame**.
- Calculate the **total number of hours per employee**.
- Calculate the **total number of hours per project**.
- Identify the **employee who spent the most hours**.

In [None]:
import pandas as pd

employees = pd.DataFrame({
    "Name": [
        "Alice", "Bob", "Charlie", "Alice",
        "Bob", "Charlie", "Alice", "Bob"
    ],
    "Project": [
        "Project A", "Project A", "Project A",
        "Project B", "Project B", "Project B",
        "Project C", "Project C"
    ],
    "Hours": [10, 8, 12, 15, 20, 10, 18, 14]
})

print("Original DataFrame:")
print(employees)

# Sum Hours across each employee's rows
hours_per_employee = employees.groupby("Name")["Hours"].sum()
print("\nTotal hours per employee:")
print(hours_per_employee)

# Sum Hours across each project's rows
hours_per_project = employees.groupby("Project")["Hours"].sum()
print("\nTotal hours per project:")
print(hours_per_project)

# idxmax returns the index label (the employee name) of the largest total
top_employee = hours_per_employee.idxmax()
print("\nEmployee with the most hours:", top_employee)
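As an alternative view of the same totals, here is a sketch that uses `pivot_table` to put employees in rows and projects in columns, with an `All` margin holding both sets of sums. This is not the approach the cell above uses; the name `pivot` is hypothetical.

```python
import pandas as pd

# Same employee data as in the task
employees = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Alice",
             "Bob", "Charlie", "Alice", "Bob"],
    "Project": ["Project A", "Project A", "Project A",
                "Project B", "Project B", "Project B",
                "Project C", "Project C"],
    "Hours": [10, 8, 12, 15, 20, 10, 18, 14],
})

# Rows are employees, columns are projects; margins=True adds "All" totals,
# and fill_value=0 covers employee/project pairs with no rows (Charlie, Project C)
pivot = employees.pivot_table(
    index="Name",
    columns="Project",
    values="Hours",
    aggfunc="sum",
    fill_value=0,
    margins=True,
)
print(pivot)

# Drop the "All" row before looking for the employee with the largest total
top_employee = pivot["All"].drop("All").idxmax()
print("Employee with the most hours:", top_employee)
```

The `All` column matches the per-employee totals and the `All` row matches the per-project totals, so one table answers both grouping questions.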


## Task 5 — Ticket Sales Analysis

You have a table containing ticket sales data with the following fields:
**Movie**, **City**, **Tickets Sold**.

### Objectives:
- Create a DataFrame with **at least twelve rows**.
- Display the **entire DataFrame**.
- Calculate the **total number of tickets sold for each movie**.
- Calculate the **total number of tickets sold for each city**.
- Identify the **movie with the highest number of ticket sales**.


In [None]:
import pandas as pd

tickets = pd.DataFrame({
    "Movie": [
        "Movie A", "Movie A", "Movie B", "Movie B",
        "Movie C", "Movie C", "Movie D", "Movie D",
        "Movie E", "Movie E", "Movie F", "Movie F"
    ],
    "City": [
        "New York", "Los Angeles", "New York", "Los Angeles",
        "New York", "Los Angeles", "New York", "Los Angeles",
        "New York", "Los Angeles", "New York", "Los Angeles"
    ],
    "Tickets Sold": [120, 150, 200, 180, 90, 110, 300, 250, 160, 170, 220, 210]
})

print("Full DataFrame:")
print(tickets)

# Sum Tickets Sold across both cities for each movie
tickets_per_movie = tickets.groupby("Movie")["Tickets Sold"].sum()
print("\nTotal tickets sold per movie:")
print(tickets_per_movie)

# Sum Tickets Sold across all movies for each city
tickets_per_city = tickets.groupby("City")["Tickets Sold"].sum()
print("\nTotal tickets sold per city:")
print(tickets_per_city)

# idxmax returns the index label (the movie title) of the largest total
top_movie = tickets_per_movie.idxmax()
print("\nMovie with the highest ticket sales:", top_movie)
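`idxmax` returns a single label, so if two movies shared the highest total, only the first would be reported. A minimal sketch of a tie-aware variant on the same ticket data follows; the names `ranked` and `top_movies` are hypothetical. With this data there is no tie, so the result contains one movie.

```python
import pandas as pd

# Same ticket data as in the task
tickets = pd.DataFrame({
    "Movie": ["Movie A", "Movie A", "Movie B", "Movie B",
              "Movie C", "Movie C", "Movie D", "Movie D",
              "Movie E", "Movie E", "Movie F", "Movie F"],
    "City": ["New York", "Los Angeles"] * 6,
    "Tickets Sold": [120, 150, 200, 180, 90, 110, 300, 250, 160, 170, 220, 210],
})

# Per-movie totals, ranked from highest to lowest
ranked = (
    tickets.groupby("Movie")["Tickets Sold"]
    .sum()
    .sort_values(ascending=False)
)
print(ranked)

# Keep every movie whose total equals the maximum, so ties are not dropped
top_movies = ranked[ranked == ranked.max()]
print("Top movie(s):", list(top_movies.index))
```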
