# Databases (SQL + NoSQL)

In this notebook, we’ll cover:
- SQL Basics (`SELECT`, `WHERE`, `GROUP BY`, `JOIN`)
- Subqueries
- Normalization basics
- MongoDB fundamentals (collections, documents)
  
We’ll also perform **hands-on tasks** using **SQLite** (for SQL) and **PyMongo** (for MongoDB).


## Section 1: SQL Basics with SQLite
We'll use Python's built-in `sqlite3` library to perform SQL operations.


In [3]:
import sqlite3
import pandas as pd

# Create in-memory database
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()


### Create Table: `students`

In [4]:
cursor.execute('''
CREATE TABLE students (
    id INTEGER PRIMARY KEY,
    name TEXT,
    marks INTEGER,
    subject TEXT
)
''')

# Insert sample data
students_data = [
    (1, 'Alice', 95, 'Math'),
    (2, 'Bob', 75, 'Science'),
    (3, 'Charlie', 82, 'Math'),
    (4, 'David', 90, 'English'),
    (5, 'Eva', 70, 'Science')
]

cursor.executemany('INSERT INTO students VALUES (?, ?, ?, ?)', students_data)
conn.commit()


### Query 1: Find students with marks > 80

In [5]:
query1 = "SELECT * FROM students WHERE marks > 80"
df1 = pd.read_sql_query(query1, conn)
df1


|   | id | name    | marks | subject |
|---|----|---------|-------|---------|
| 0 | 1  | Alice   | 95    | Math    |
| 1 | 3  | Charlie | 82    | Math    |
| 2 | 4  | David   | 90    | English |
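
The threshold in the query above is written directly into the SQL string. As a minimal sketch, the same filter can bind the value through a `?` placeholder, the mechanism already used for the `INSERT` statements. The snippet rebuilds the sample table in its own in-memory database so it runs independently; `demo_conn` and `min_marks` are names introduced here for illustration only.

```python
import sqlite3

# Separate in-memory database so this sketch does not touch the notebook's `conn`
demo_conn = sqlite3.connect(':memory:')
demo_conn.execute(
    'CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, marks INTEGER, subject TEXT)'
)
demo_conn.executemany(
    'INSERT INTO students VALUES (?, ?, ?, ?)',
    [
        (1, 'Alice', 95, 'Math'),
        (2, 'Bob', 75, 'Science'),
        (3, 'Charlie', 82, 'Math'),
        (4, 'David', 90, 'English'),
        (5, 'Eva', 70, 'Science'),
    ],
)

min_marks = 80  # hypothetical threshold, bound to the `?` placeholder below

# The value is passed as a one-element tuple rather than formatted into the SQL text
rows = demo_conn.execute(
    'SELECT name, marks FROM students WHERE marks > ? ORDER BY id',
    (min_marks,),
).fetchall()

print(rows)  # [('Alice', 95), ('Charlie', 82), ('David', 90)]
```

Binding values this way keeps the query text constant and avoids building SQL through string concatenation.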


### Query 2: Get average marks by subject (`GROUP BY`)

In [6]:
query2 = "SELECT subject, AVG(marks) as avg_marks FROM students GROUP BY subject"
df2 = pd.read_sql_query(query2, conn)
df2

|   | subject | avg_marks |
|---|---------|-----------|
| 0 | English | 90.0      |
| 1 | Math    | 88.5      |
| 2 | Science | 72.5      |
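
The topic list at the top of the notebook mentions `JOIN` and normalization. The sketch below illustrates both under stated assumptions: it is not taken from the original cells, and the `subjects` and `students_norm` tables, along with the `norm_conn` connection, are hypothetical names. The idea is that the subject name, repeated as text in every `students` row, moves into its own table, and a `JOIN` puts the two pieces back together.

```python
import sqlite3

# Separate in-memory database for the illustration
norm_conn = sqlite3.connect(':memory:')

# Hypothetical normalized layout: each subject name is stored exactly once
norm_conn.executescript('''
CREATE TABLE subjects (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE
);
CREATE TABLE students_norm (
    id INTEGER PRIMARY KEY,
    name TEXT,
    marks INTEGER,
    subject_id INTEGER REFERENCES subjects(id)
);
''')

norm_conn.executemany(
    'INSERT INTO subjects VALUES (?, ?)',
    [(1, 'Math'), (2, 'Science'), (3, 'English')],
)
norm_conn.executemany(
    'INSERT INTO students_norm VALUES (?, ?, ?, ?)',
    [(1, 'Alice', 95, 1), (2, 'Bob', 75, 2), (3, 'Charlie', 82, 1),
     (4, 'David', 90, 3), (5, 'Eva', 70, 2)],
)

# INNER JOIN: match each student row to its subject row via the foreign key
joined = norm_conn.execute('''
SELECT s.name, s.marks, sub.name AS subject
FROM students_norm AS s
JOIN subjects AS sub ON sub.id = s.subject_id
ORDER BY s.id
''').fetchall()

for row in joined:
    print(row)
```

With this layout, renaming a subject means updating one row in `subjects` instead of every matching student row.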


## Section 2: Mini Project — Movies Database
Let’s create a small movie database and perform some analytical queries.


In [8]:
cursor.execute('''
CREATE TABLE movies (
    id INTEGER PRIMARY KEY,
    title TEXT,
    rating REAL,
    genre TEXT
)
''')

movies_data = [
    (1, 'Inception', 8.8, 'Sci-Fi'),
    (2, 'The Dark Knight', 9.0, 'Action'),
    (3, 'Interstellar', 8.6, 'Sci-Fi'),
    (4, 'Parasite', 8.6, 'Thriller'),
    (5, 'Titanic', 7.8, 'Romance')
]

cursor.executemany('INSERT INTO movies VALUES (?, ?, ?, ?)', movies_data)
conn.commit()


### Query 1: Find top 3 movies by rating

In [9]:
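# Interstellar and Parasite tie at 8.6; the secondary sort on title makes the third row deterministic
query3 = "SELECT title, rating FROM movies ORDER BY rating DESC, title ASC LIMIT 3"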
df3 = pd.read_sql_query(query3, conn)
df3


|   | title           | rating |
|---|-----------------|--------|
| 0 | The Dark Knight | 9.0    |
| 1 | Inception       | 8.8    |
| 2 | Interstellar    | 8.6    |


### Query 2: Count movies by genre

In [10]:
query4 = "SELECT genre, COUNT(*) as movie_count FROM movies GROUP BY genre"
df4 = pd.read_sql_query(query4, conn)
df4

|   | genre    | movie_count |
|---|----------|-------------|
| 0 | Action   | 1           |
| 1 | Romance  | 1           |
| 2 | Sci-Fi   | 2           |
| 3 | Thriller | 1           |
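
Subqueries are listed among the notebook's topics, and the movies table is a convenient place to try one. The following is a minimal sketch, not part of the original cells: it rebuilds the same sample data in a separate in-memory database (`sub_conn` is an illustrative name) and uses a scalar subquery to keep only the movies rated above the overall average.

```python
import sqlite3

# Separate in-memory database so the sketch runs on its own
sub_conn = sqlite3.connect(':memory:')
sub_conn.execute(
    'CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT, rating REAL, genre TEXT)'
)
sub_conn.executemany(
    'INSERT INTO movies VALUES (?, ?, ?, ?)',
    [
        (1, 'Inception', 8.8, 'Sci-Fi'),
        (2, 'The Dark Knight', 9.0, 'Action'),
        (3, 'Interstellar', 8.6, 'Sci-Fi'),
        (4, 'Parasite', 8.6, 'Thriller'),
        (5, 'Titanic', 7.8, 'Romance'),
    ],
)

# The inner SELECT is evaluated to a single value (the average rating),
# which the outer query then uses as its filter threshold
above_avg = sub_conn.execute('''
SELECT title, rating
FROM movies
WHERE rating > (SELECT AVG(rating) FROM movies)
ORDER BY rating DESC, title
''').fetchall()

for title, rating in above_avg:
    print(title, rating)
```

The average of the five sample ratings is 8.56, so every movie except Titanic passes the filter.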


## Section 3: MongoDB Basics (NoSQL)
For this part, you’ll need a running MongoDB server, either installed locally or hosted on a cloud service such as MongoDB Atlas.  
We’ll use the `pymongo` library to connect and perform operations.


In [11]:
# Install PyMongo if it is not already available in this environment
!pip install pymongo


Collecting pymongo
  Downloading pymongo-4.15.3-cp312-cp312-win_amd64.whl.metadata (22 kB)
Collecting dnspython<3.0.0,>=1.16.0 (from pymongo)
  Using cached dnspython-2.8.0-py3-none-any.whl.metadata (5.7 kB)
Downloading pymongo-4.15.3-cp312-cp312-win_amd64.whl (910 kB)

### Connect to MongoDB and Insert Documents

In [13]:
from pymongo import MongoClient

# Connect to local MongoDB (make sure MongoDB is running)
client = MongoClient("mongodb://localhost:27017/")

# Create or access database
db = client["week6_demo"]

# Create or access collection
collection = db["users"]

# Insert sample JSON-like documents
users_data = [
    {"name": "John", "age": 25, "skills": ["Python", "SQL"]},
    {"name": "Sara", "age": 28, "skills": ["R", "Machine Learning"]},
    {"name": "Tom", "age": 22, "skills": ["Python", "MongoDB"]},
    {"name": "Lily", "age": 30, "skills": ["C++", "Java"]}
]

# Note: re-running this cell inserts the same four documents again
collection.insert_many(users_data)


InsertManyResult([ObjectId('68ede31aea1edfb40526a126'), ObjectId('68ede31aea1edfb40526a127'), ObjectId('68ede31aea1edfb40526a128'), ObjectId('68ede31aea1edfb40526a129')], acknowledged=True)

### Query: Find all users with skill = "Python"

In [14]:
python_users = collection.find({"skills": "Python"})
for user in python_users:
    print(user)


{'_id': ObjectId('68ede31aea1edfb40526a126'), 'name': 'John', 'age': 25, 'skills': ['Python', 'SQL']}
{'_id': ObjectId('68ede31aea1edfb40526a128'), 'name': 'Tom', 'age': 22, 'skills': ['Python', 'MongoDB']}
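
The filter `{"skills": "Python"}` matches John and Tom even though `skills` is a list, because an equality condition on an array field matches when any element of the array equals the value. The plain-Python sketch below mimics that rule on the same sample documents. It only illustrates the matching semantics and says nothing about how MongoDB evaluates the query internally; `matches_skill` is a hypothetical helper, not a PyMongo function.

```python
# Same sample documents as above, kept as plain dictionaries (no database needed)
users = [
    {"name": "John", "age": 25, "skills": ["Python", "SQL"]},
    {"name": "Sara", "age": 28, "skills": ["R", "Machine Learning"]},
    {"name": "Tom", "age": 22, "skills": ["Python", "MongoDB"]},
    {"name": "Lily", "age": 30, "skills": ["C++", "Java"]},
]


def matches_skill(doc, skill):
    """Hypothetical helper: True if the `skills` field equals or contains `skill`."""
    value = doc.get("skills")
    if isinstance(value, list):
        # Array field: any single element equal to the value counts as a match
        return skill in value
    # Scalar field: fall back to plain equality
    return value == skill


python_names = [u["name"] for u in users if matches_skill(u, "Python")]
print(python_names)  # ['John', 'Tom']
```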


## Summary
In this notebook, you:
- Practiced SQL basics: `SELECT`, `WHERE`, `GROUP BY`, `JOIN`  
- Created and queried SQLite databases  
- Learned MongoDB basics with PyMongo  
- Practiced NoSQL document insertion and queries  

You now have a foundational understanding of **SQL and NoSQL** for data handling and analysis.
