# Python Basics for Data Analysis

This notebook covers the basics of Python for data analysis tasks:
- Variables & Data Types
- Control Flow (if/else, loops)
- Functions
- Data Structures (lists, tuples, dictionaries)
- Using External Libraries
- Using Pandas (An Introduction)
- Example Problem

Author: Aranya Sharma
Date: July 2025

## Variables & Data Types

In [1]:
# Integer
a = 10
print(a, type(a))

# Float
b = 3.14
print(b, type(b))

# String
c = "Hello"
print(c, type(c))

# Boolean
d = True
print(d, type(d))

10 <class 'int'>
3.14 <class 'float'>
Hello <class 'str'>
True <class 'bool'>


Here we define variables of four built-in types (`int`, `float`, `str`, and `bool`) and print each value along with its type.
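
Data read from files often arrives as text, so converting between these types is a common next step. The sketch below is a minimal illustration; the variable names (`count_text`, `n_items`, `ratio`, `label`) are hypothetical and not part of the cell above.

```python
# Convert between the basic built-in types.
count_text = "42"                  # a string that happens to contain digits
n_items = int(count_text)          # str -> int
ratio = float(n_items) / 8         # int -> float, then true division
label = str(n_items) + " items"    # int -> str so it can be concatenated

# isinstance() checks a value's type without printing the type itself.
print(n_items, isinstance(n_items, int))
print(ratio, isinstance(ratio, float))
print(label)
```

Note that `int("3.14")` raises a `ValueError`; a string holding a decimal number has to go through `float()` first.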

## Control Flow - Loops and Conditionals

In [2]:
# If-else
x = 10
if x > 0:
    print("Positive")
else:
    print("Non-positive")

# For loop
for i in range(10):
    print(i)

# While loop
count = 0
while count < 10:
    print(count)   
    count += 1 

Positive
0
1
2
3
4
5
6
7
8
9
0
1
2
3
4
5
6
7
8
9


Conditionals let you choose which block of code runs, depending on whether a condition is true.
Loops repeat a block of code: a `for` loop runs once per item in a sequence, and a `while` loop keeps running as long as its condition holds.
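
In practice, loops and conditionals are usually combined. The following sketch walks over a small list and classifies each value; the list `values` and the counter `positives` are made-up names used only for illustration.

```python
# Hypothetical sample values; any list of numbers would work.
values = [4, -2, 0, 7, -9]

positives = 0
for v in values:
    if v > 0:
        positives += 1          # count only the strictly positive values
        print(v, "is positive")
    elif v == 0:
        print(v, "is zero")
    else:
        print(v, "is negative")

print("Positive count:", positives)
```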

## Functions

In [3]:
def greet(name):
    return f"Hello, {name}!"
    
print(greet("Aranya"))    

Hello, Aranya!


Functions allow you to reuse code and make it modular.
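
A function can also take a parameter with a default value, which callers may override. The sketch below assumes a hypothetical helper named `average`; it is not defined anywhere else in this notebook.

```python
def average(numbers, digits=2):
    """Return the mean of `numbers`, rounded to `digits` decimal places."""
    if not numbers:
        raise ValueError("numbers must not be empty")
    return round(sum(numbers) / len(numbers), digits)


print(average([1, 2, 3, 4]))          # uses the default digits=2
print(average([1, 2, 2], digits=1))   # overrides the default
```

Raising an error for an empty list is one possible design choice; returning `None` would be another.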

## Lists, Tuples & Dictionaries

In [4]:
# List
my_list = [1, 2, 3, 4]
print(my_list)

# Tuple
my_tuple = (1, 2, 3, 4)
print(my_tuple)

# Dictionary
my_dict = {"name": "Aranya", "age": 21}
print(my_dict)

[1, 2, 3, 4]
(1, 2, 3, 4)
{'name': 'Aranya', 'age': 21}


Lists are mutable sequences, tuples are immutable, and dictionaries store key-value pairs.
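
The difference between mutable and immutable types is easiest to see by trying to change each one. This is a small sketch with hypothetical names (`scores`, `point`, `record`):

```python
scores = [70, 85, 90]      # list: can be changed in place
scores.append(60)          # add an element at the end
scores[0] = 75             # replace an existing element

point = (3, 4)             # tuple: fixed once created
try:
    point[0] = 10          # item assignment is not allowed on tuples
except TypeError as err:
    print("Tuples are immutable:", err)

record = {"name": "Alice", "age": 25}
record["score"] = 88       # add a new key-value pair
for key, value in record.items():
    print(key, "->", value)

print(scores)
```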

## Using External Libraries

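Python has a rich ecosystem of libraries that make data analysis easier. Here we import two modules from the standard library (`math` and `random`) and one third-party package (`numpy`), which has to be installed separately.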

In [5]:
# Importing the math library
import math

print("Square root of 4 is:", math.sqrt(4))

# Importing the random library
import random

print("A random number between 1 and 100:", random.randint(1, 100))

# Importing the numpy library
import numpy as np

arr = np.array([1, 2, 3, 4])
print("Numpy array:", arr)
print("Mean of array:", np.mean(arr))

Square root of 4 is: 2.0
A random number between 1 and 100: 61
Numpy array: [1 2 3 4]
Mean of array: 2.5


Here we have used:
- `math` for mathematical operations.
- `random` for generating random numbers.
- `numpy` for numerical computations on arrays.
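
Because `random` produces a different number on every run, the output shown above will not repeat exactly. One common way to get repeatable results is to seed the generator first. The sketch below uses only the standard library; the seed value 42 is an arbitrary choice.

```python
import math
import random

random.seed(42)   # arbitrary seed, chosen only for illustration
first = [random.randint(1, 100) for _ in range(5)]

random.seed(42)   # re-seeding with the same value restarts the sequence
second = [random.randint(1, 100) for _ in range(5)]

print(first == second)                   # the two lists are identical
print(math.floor(2.7), math.ceil(2.2))   # round down and round up
```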

## Using Pandas (An Introduction)

In [6]:
# Importing pandas and creating a simple DataFrame
import pandas as pd

data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 28]
}

df = pd.DataFrame(data)
print(df)

      Name  Age
0    Alice   25
1      Bob   30
2  Charlie   28
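
A DataFrame stores data in labelled rows and columns. Once the data is loaded, typical first steps are selecting a column and filtering rows. The sketch below rebuilds the same small table under a different, hypothetical name (`people`) so it does not overwrite `df`.

```python
import pandas as pd

people = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 28],
})

ages = people["Age"]                 # a single column is a pandas Series
older = people[people["Age"] > 26]   # a boolean mask keeps the matching rows
mean_age = ages.mean()

print(older)
print("Mean age:", mean_age)
```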


## Example Problem: Analyze Random Numbers

We generate a list of 101 random integers between 1 and 50 (inclusive), convert it into a NumPy array, and then compute basic statistics: mean, median, and standard deviation. We also load the data into a pandas DataFrame and display the first five rows.

In [7]:
import random
import numpy as np
import pandas as pd

# Generate 101 random integers between 1 and 50 (inclusive)
random_numbers = [random.randint(1, 50) for _ in range(101)]

# Convert to numpy array
arr = np.array(random_numbers)

# Calculate statistics
mean = np.mean(arr)
median = np.median(arr)
std_dev = np.std(arr)

print(f"Mean: {mean}")
print(f"Median: {median}")
print(f"Standard Deviation: {std_dev}")

# Additional: Create a pandas DataFrame
df = pd.DataFrame(arr, columns=["Random Numbers"])
print(df.head())

Mean: 26.81188118811881
Median: 27.0
Standard Deviation: 13.824009012366947
   Random Numbers
0              17
1               9
2              20
3              21
4              49
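
Two details are worth keeping in mind when reading these statistics. First, `np.std` computes the population standard deviation by default (`ddof=0`); passing `ddof=1` gives the sample standard deviation. Second, the numbers change on every run unless the generator is seeded. The sketch below illustrates both points with a small hypothetical array (`values`) and NumPy's own generator, used here as an alternative to the `random` module; the seed 0 is arbitrary.

```python
import numpy as np

values = np.array([2, 4, 4, 4, 5, 5, 7, 9])   # hypothetical data

population_std = np.std(values)          # ddof=0: divide by n
sample_std = np.std(values, ddof=1)      # ddof=1: divide by n - 1

print("Population std:", population_std)
print("Sample std:", sample_std)

# Seeded generators produce the same draws every time.
# Note: the upper bound of integers() is exclusive, so 51 yields 1..50.
draws_a = np.random.default_rng(0).integers(1, 51, size=101)
draws_b = np.random.default_rng(0).integers(1, 51, size=101)
print(np.array_equal(draws_a, draws_b))
```

pandas uses the opposite default: `Series.std()` and `DataFrame.std()` use `ddof=1`, so the same data can report a slightly different standard deviation depending on which library computes it.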
