# Part 1: Database Management for Big Data

## Introduction to Database Management
- Big data projects often require databases that can handle large, varied, and rapidly changing datasets.
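- NoSQL databases offer flexible data models and, in their distributed forms, horizontal scalability, which makes them suitable for big data applications. TinyDB is a small, single-file NoSQL database that is convenient for learning these ideas, although it is not itself built for large datasets.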

## Importance of NoSQL Databases
- **Flexibility:** NoSQL databases allow a variety of data models, including key-value, document, graph, and wide-column stores, which makes them adaptable to various data types and structures.
- **Scalability:** Many NoSQL systems are designed to scale out across a distributed architecture, which is critical for big data environments.
- **Schema-less:** NoSQL databases do not require a predefined schema, allowing you to work with unstructured data.

## Python Library: TinyDB
- TinyDB is a lightweight, document-oriented database optimized for small projects and prototyping within Python environments. It stores data in JSON format, providing an easy and flexible way to manage data.

## Key Concepts in NoSQL with TinyDB
- **Document-Oriented Storage:** Data is stored as documents (Python dictionaries), which TinyDB groups into tables, its counterpart to the collections found in other document stores. Each document can have different fields and value types.
- **CRUD Operations:** TinyDB supports basic CRUD (Create, Read, Update, Delete) operations which are fundamental for database management.
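- **Querying:** TinyDB provides an expressive `Query` object for filtering documents by field values. It has no indexes, so each query scans the whole table, which is fine for small datasets but not for large ones.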

## Real-World Application: Inventory Management
- NoSQL databases are particularly useful in scenarios like inventory management for e-commerce or retail businesses, where data structures can vary and rapid changes are common.


# Part 2: Follow Me - Managing a Hardware Store Inventory with TinyDB

In [None]:
!pip install tinydb

Collecting tinydb
  Downloading tinydb-4.8.0-py3-none-any.whl (24 kB)
Installing collected packages: tinydb
Successfully installed tinydb-4.8.0


In [None]:
from tinydb import TinyDB, Query

To install TinyDB from a shell instead of from inside the notebook, run:
pip install tinydb

In [None]:
# Step 1: Initialize the TinyDB Database
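# 'inventory_db.json' is the file where all our data is stored, in JSON format.
# TinyDB creates the file if it does not exist and reuses it otherwise, so re-running
# the notebook adds to the existing data; delete the file to start from a clean inventory.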
db = TinyDB('inventory_db.json')

In [None]:
# Step 2: Inserting Items into the Inventory
# Here we insert multiple items into the database. Each item is a 'document' in TinyDB terminology.
# The documents are schema-less, which means each document can have a different structure,
# but we maintain a consistent structure to simplify our inventory system.
db.insert({'type': 'tool', 'name': 'hammer', 'quantity': 75, 'price': 22.50})
db.insert({'type': 'tool', 'name': 'screwdriver', 'quantity': 50, 'price': 9.99})
db.insert({'type': 'material', 'name': 'nail', 'quantity': 1000, 'price': 0.10})

3

Explanation of Schema:
Each document here represents an inventory item with four fields:
- type (string): Category of the item (e.g., tool, material).
- name (string): The specific name of the item (e.g., hammer, nail).
- quantity (integer): The number of items in stock.
- price (float): The cost of one item.
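
Because TinyDB does not enforce this structure, a small check before inserting can catch mistakes early. The following is only a sketch: `EXPECTED_FIELDS` and `validate_item` are hypothetical helpers, not part of TinyDB.

```python
# Field names and types taken from the schema described above.
EXPECTED_FIELDS = {'type': str, 'name': str, 'quantity': int, 'price': float}


def validate_item(item):
    """Return a list of problems; an empty list means the item looks valid."""
    problems = []
    for field, expected_type in EXPECTED_FIELDS.items():
        if field not in item:
            problems.append(f"missing field: {field}")
        elif not isinstance(item[field], expected_type):
            problems.append(f"{field} should be {expected_type.__name__}")
    return problems


good = validate_item({'type': 'tool', 'name': 'hammer', 'quantity': 75, 'price': 22.50})
bad = validate_item({'type': 'tool', 'name': 'wrench', 'quantity': '12'})

print(good)  # → []
print(bad)   # → ['quantity should be int', 'missing field: price']
```

Calling such a check before `db.insert(...)` keeps the documents consistent even though the database itself is schema-less.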

In [None]:
# Step 3: Querying Tools in the Inventory
# We use TinyDB's Query object to search through the database.
# This example demonstrates how to find all items where the 'type' is 'tool'.
Item = Query()
tools = db.search(Item.type == 'tool')
print("Available Tools:", tools)

Available Tools: [{'type': 'tool', 'name': 'hammer', 'quantity': 75, 'price': 22.5}, {'type': 'tool', 'name': 'screwdriver', 'quantity': 50, 'price': 9.99}]


In [None]:
# Step 4: Updating the Quantity of Hammers
# Updates are performed by specifying a condition and the new data to apply.
# Here, we update the quantity of 'hammer' by setting it to 100.
db.update({'quantity': 100}, Item.name == 'hammer')

[1]

In [None]:
# Step 5: Removing Discontinued Items
# To remove an item, we specify a condition that matches the item to be removed.
# This command removes the item 'nail' from the database.
db.remove(Item.name == 'nail')

[3]

In [None]:

# Query 1: Find all items with a quantity less than 100
low_stock_items = db.search(Item.quantity < 100)
print("Items with Low Stock:", low_stock_items)

Items with Low Stock: [{'type': 'tool', 'name': 'screwdriver', 'quantity': 50, 'price': 9.99}]


In [None]:
# Query 2: Find all items with a price over $10
high_price_items = db.search(Item.price > 10)
print("High Priced Items:", high_price_items)


High Priced Items: [{'type': 'tool', 'name': 'hammer', 'quantity': 100, 'price': 22.5}]


In [None]:
# Query 3: Count the tool entries (documents) in the inventory, not the units in stock
tool_count = len(db.search(Item.type == 'tool'))
print("Total Number of Tools:", tool_count)

Total Number of Tools: 2


# Part 3: Your Turn - Advanced Database Management Task

This part of the course will engage you in a real-world database management project. You'll apply the techniques you've learned to design and manage a database using TinyDB, handling various data complexities such as schema design, complex queries, and data relationships.

## Database Design:
- Design a database schema that fits a real-world application scenario you choose (e.g., a bookstore, a sports team management system, a personal contacts manager). Although TinyDB is schema-less, plan a logical structure for your documents to ensure consistency.

## Complex Queries:
- Develop complex queries that can handle multiple fields and conditions to extract meaningful information from the database.

## Data Relationships:
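- Implement and manage one-to-many and many-to-many relationships between documents. For instance, in a bookstore, a single author might have multiple books. TinyDB has no joins, so store references (such as document IDs) in your documents and resolve them in Python.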

## Data Integrity:
- Ensure the integrity of your database by implementing checks that prevent the insertion of duplicate or conflicting data.

## Performance Evaluation:
- Assess and document the performance of your database system, particularly focusing on the efficiency of your queries and the scalability of your database design.

## Aggregation and Reporting:
- TinyDB has no built-in aggregation functions, so perform aggregations akin to SQL's 'GROUP BY' in Python over the documents your queries return, and generate reports that summarize key aspects of your data.

## Instructions:
1. Choose a real-world application scenario and design a database schema that fits the needs of this scenario.
2. Populate your database with initial data and implement the database operations required to manage this data.
3. Create complex queries and manage data relationships within your database.
4. Assess the performance of your database system and make necessary adjustments.
5. Compile your steps and insights into the Jupyter notebook and submit it as your completed assignment.