# 1. Introduction to Data Science

Data science is an interdisciplinary field that uses scientific methods, algorithms, processes, and systems to extract insights and knowledge from structured and unstructured data. It combines knowledge from various domains such as statistics, mathematics, computer science, and domain-specific fields to analyze and interpret complex data sets.



## What is Data Science?

Data science involves the following key components:

### 1. Data Collection
Data collection is the process of gathering relevant data from various sources. This may include databases, sensors, APIs, web scraping, and more. The quality and quantity of data collected are crucial for the success of data science projects.

### 2. Data Cleaning and Preprocessing
Data cleaning involves identifying and correcting errors, inconsistencies, and missing values in the data set. Preprocessing includes transforming raw data into a format suitable for analysis, which may involve normalization, scaling, and feature engineering.
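
As a minimal sketch of these two steps, the snippet below fills in missing values with the mean of the observed ones and then applies min-max normalization. The `raw_ages` list and the choice of mean imputation are assumptions made for illustration; real projects often rely on a library such as pandas for this work.

```python
# Hypothetical raw data: None marks a missing value
raw_ages = [25, None, 35, 45, None]

# Cleaning: replace missing values with the mean of the observed values
observed = [age for age in raw_ages if age is not None]
mean_age = sum(observed) / len(observed)
cleaned = [age if age is not None else mean_age for age in raw_ages]

# Preprocessing: min-max normalization rescales every value into the range [0, 1]
lowest, highest = min(cleaned), max(cleaned)
normalized = [(age - lowest) / (highest - lowest) for age in cleaned]

print("Cleaned:", cleaned)        # [25, 35.0, 35, 45, 35.0]
print("Normalized:", normalized)  # [0.0, 0.5, 0.5, 1.0, 0.5]
```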

### 3. Exploratory Data Analysis (EDA)
EDA is the process of analyzing and visualizing data to uncover patterns, trends, and relationships. It helps in gaining insights into the data set and identifying potential variables that influence the outcome.
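
A small, hedged illustration of EDA without any plotting: summary statistics plus a hand-computed Pearson correlation between two variables. The `hours` and `scores` values are invented for this sketch and do not come from a real data set.

```python
import statistics

# Hypothetical data: hours studied and exam scores for five students
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 65, 71, 79]

# Summary statistics give a first impression of the data
print("Mean score:", statistics.mean(scores))
print("Median score:", statistics.median(scores))
print("Score range:", min(scores), "to", max(scores))

# Pearson correlation measures how strongly two variables move together
mean_h = statistics.mean(hours)
mean_s = statistics.mean(scores)
covariance = sum((h - mean_h) * (s - mean_s) for h, s in zip(hours, scores))
spread_h = sum((h - mean_h) ** 2 for h in hours) ** 0.5
spread_s = sum((s - mean_s) ** 2 for s in scores) ** 0.5
correlation = covariance / (spread_h * spread_s)
print("Correlation between hours and scores:", round(correlation, 3))
```

A correlation close to 1 suggests that `hours` is a variable worth keeping when modeling `scores`.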

### 4. Machine Learning and Statistical Modeling
Machine learning algorithms and statistical models are used to build predictive and descriptive models from data. This involves training models on historical data and using them to make predictions or gain insights into future outcomes.

### 5. Evaluation and Interpretation
Once models are trained, they need to be evaluated to assess their performance and generalization capabilities. Interpretation of model results is crucial for making data-driven decisions and extracting actionable insights.
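
To make the idea of generalization concrete, here is a rough sketch that holds out part of the data, fits a deliberately simple threshold rule on the rest, and measures accuracy on the held-out examples. The data, the 70/30 split, and the threshold "model" are all assumptions chosen for illustration, not a recommended modeling approach.

```python
# Hypothetical labelled data: (hours studied, passed the exam?)
data = [(1, False), (2, False), (3, False), (4, True), (5, True),
        (6, True), (2.5, False), (4.5, True), (3.5, True), (1.5, False)]

# Hold out the last 30% of the examples for testing
split = len(data) * 7 // 10
train, test = data[:split], data[split:]

# "Train" a very simple model: the threshold is the midpoint between
# the average hours of passing and failing students in the training set
pass_hours = [h for h, passed in train if passed]
fail_hours = [h for h, passed in train if not passed]
threshold = (sum(pass_hours) / len(pass_hours) + sum(fail_hours) / len(fail_hours)) / 2

def predict(hours):
    """Predict a pass when the hours studied reach the learned threshold."""
    return hours >= threshold

# Evaluate on data the model has never seen
correct = sum(predict(h) == passed for h, passed in test)
accuracy = correct / len(test)
print("Threshold:", threshold)
print("Test accuracy:", round(accuracy, 2))
```

Measuring accuracy on `test` rather than `train` is what gives an honest estimate of how the model behaves on new data.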

### 6. Deployment and Maintenance
Deploying models into production environments and maintaining their performance over time is essential for real-world applications. This involves monitoring model performance, retraining models with new data, and updating them as needed.

## Why Data Science?

Data science has become increasingly important across industries for the following reasons:

- **Data-driven Decision Making:** Organizations can base decisions on evidence from their data rather than on intuition or guesswork.
- **Predictive Analytics:** Data science enables organizations to predict future trends, customer behavior, and market changes, allowing them to stay ahead of the competition.
- **Personalization:** By analyzing large volumes of data, companies can personalize products, services, and marketing strategies to meet the specific needs of individual customers.
- **Efficiency and Automation:** Data science can automate repetitive tasks, optimize processes, and improve efficiency, leading to cost savings and increased productivity.

In summary, data science plays a crucial role in extracting valuable insights from data, driving innovation, and creating a competitive advantage for organizations across various industries.


# 2. Python basics

## 2.1 Variables and Arithmetic Operations

In [None]:
x = 5
y = 3
print("Sum:", x + y)
print("Difference:", x - y)
print("Product:", x * y)
print("Quotient:", x / y)
print('Power:', x**y)

Sum: 8
Difference: 2
Product: 15
Quotient: 1.6666666666666667
Power: 125
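
Two further operators are worth knowing alongside the ones above: floor division and modulo. The short sketch below reuses the same `x` and `y` and also shows how parentheses change the order of evaluation.

```python
x = 5
y = 3
print("Floor division:", x // y)     # 1: the quotient rounded down to a whole number
print("Remainder:", x % y)           # 2: what is left over after floor division
print("Precedence:", x + y * 2)      # 11: multiplication runs before addition
print("Parentheses:", (x + y) * 2)   # 16: parentheses are evaluated first
```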


## 2.2 Strings and string operations

In [None]:
name = "John"
print("Hello,", name)
print("Length of name:", len(name))
print("Uppercase:", name.upper())
print("Lowercase:", name.lower())

# String Concatenation
first_name = "John"
last_name = "Doe"
full_name = first_name + " " + last_name
print("Full name:", full_name)

# String Repetition
greeting = "Hello! "
repeated_greeting = greeting * 3
print("Repeated greeting:", repeated_greeting)

# String Slicing
sliced_name = name[1:3]
print("Sliced name (1:3):", sliced_name)

# String Replacement
replaced_name = name.replace("o", "a")
print("Replaced 'o' with 'a':", replaced_name)

# Check if string starts with a specific substring
starts_with_j = name.startswith("J")
print("Starts with 'J':", starts_with_j)

# Check if string ends with a specific substring
ends_with_n = name.endswith("n")
print("Ends with 'n':", ends_with_n)

# Splitting a string
split_name = full_name.split()
print("Split full name:", split_name)

# Joining a list of strings
joined_name = " ".join(split_name)
print("Joined name:", joined_name)

# Finding a substring
position = name.find("o")
print("Position of 'o':", position)

# Counting occurrences of a character
count_o = name.count("o")
print("Count of 'o':", count_o)


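Hello, John
Length of name: 4
Uppercase: JOHN
Lowercase: john
Full name: John Doe
Repeated greeting: Hello! Hello! Hello! 
Sliced name (1:3): oh
Replaced 'o' with 'a': Jahn
Starts with 'J': True
Ends with 'n': True
Split full name: ['John', 'Doe']
Joined name: John Doe
Position of 'o': 1
Count of 'o': 1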
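
Formatted string literals (f-strings) are a common alternative to passing several arguments to `print` or joining strings with `+`. The sketch below is an optional extra; the `age` variable is introduced here only for illustration.

```python
name = "John"
age = 30  # illustrative value, not used elsewhere in this section

# Expressions inside {} are evaluated and inserted into the string
message = f"{name} is {age} years old"
print(message)
print(f"Next year {name} will be {age + 1}")
print(f"Name in uppercase: {name.upper()}")
```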


# 3. Data types

In [None]:
# Integers and floats
num_int = 10
num_float = 3.14

# Strings
str1 = "Hello"
str2 = 'World'

# Lists  - mutable
numbers = [1, 2, 3, 4, 5]

# Tuples - immutable
my_tuple = (1,2,3)

# Dictionaries (key-value pair)
my_dict = {"name": "John", "age": 30, "city": "New York", "major": "Computer Science"}

# Boolean
is_true = True
is_false = False
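
The difference between mutable lists and immutable tuples can be checked directly. This small sketch also uses the built-in `type` function to inspect a few values; the variable names mirror the cell above.

```python
numbers = [1, 2, 3]
my_tuple = (1, 2, 3)

# type() reports the data type of a value
print(type(10), type(3.14), type("Hello"), type(True))

# Lists are mutable: an element can be replaced in place
numbers[0] = 99
print("Modified list:", numbers)

# Tuples are immutable: the same assignment raises a TypeError
try:
    my_tuple[0] = 99
except TypeError as error:
    print("Tuples cannot be changed:", error)
```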

## 3.1 Lists

In [None]:
# Example list
my_list = [10, 20, 30, 40, 50]
print("List:", my_list)
print("Length of list:", len(my_list))
print("First element:", my_list[0])
print("Last element:", my_list[4])

List: [10, 20, 30, 40, 50]
Length of list: 5
First element: 10
Last element: 50


In [None]:
list1 = [2,4,5,7,9,0,10,12]
sliced_list = list1[::-2]  # Every second element, starting from the end and moving backwards
sliced_list

[12, 0, 7, 4]
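
Slices follow the general pattern `list[start:stop:step]`, where `stop` is excluded and any of the three parts may be omitted. A few more variations on the same `list1`, as a quick reference:

```python
list1 = [2, 4, 5, 7, 9, 0, 10, 12]

first_three = list1[:3]     # start defaults to 0          -> [2, 4, 5]
middle = list1[2:6]         # indices 2, 3, 4, 5           -> [5, 7, 9, 0]
every_second = list1[::2]   # step of 2 from the beginning -> [2, 5, 9, 10]
last_two = list1[-2:]       # negative start counts from the end -> [10, 12]

print(first_three, middle, every_second, last_two)
```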

In [None]:
# Indexing and Other Functions with Lists

# Lists are versatile data structures in Python that allow us to store collections of items.

# Example list
my_list = [10, 20, 30, 40, 50]
print("List:", my_list)
print("Length of list:", len(my_list))
print("First element:", my_list[0])
print("Last element:", my_list[-1])

# Indexing
# Accessing elements in a list using index values
first_element = my_list[0]  # Index 0 corresponds to the first element
second_element = my_list[1]  # Index 1 corresponds to the second element
last_element = my_list[-1]  # Negative indexing to access the last element
second_last_element = my_list[-2]  # Negative indexing to access the second last element

# Slicing
# Extracting a subsequence of elements from a list
sliced_list = my_list[1:4]  # Elements at indices 1 through 3 (index 4 is excluded)
reversed_list = my_list[::-1]  # Reversing the list

# List Functions

# Length of a list
length_of_list = len(my_list)

# Append
# Adding an element to the end of the list
my_list.append(60)

# Insert
# Inserting an element at a specified index position
my_list.insert(2, 25)  # Inserting 25 at index 2

# Remove
# Removing the first occurrence of a specified value from the list
my_list.remove(30)

# Pop
# Removing and returning the element at a specified index
popped_element = my_list.pop(3)  # Removing element at index 3

# Count
# Counting the number of occurrences of a specified element in the list
count_20 = my_list.count(20)

# Sorting
# Sorting the elements of the list in ascending order
my_list.sort()

# Reverse
# Reversing the order of elements in the list
my_list.reverse()

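# Printing the results
# Note: my_list is printed only after every operation above has run, so each line
# that prints my_list shows the same final state rather than the intermediate ones.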
print("First element:", first_element)
print("Second element:", second_element)
print("Last element:", last_element)
print("Second last element:", second_last_element)
print("Sliced list:", sliced_list)
print("Reversed list:", reversed_list)
print("Length of list:", length_of_list)
print("Appended list:", my_list)
print("Inserted list:", my_list)
print("After removal:", my_list)
print("Popped element:", popped_element)
print("Count of 20:", count_20)
print("Sorted list:", my_list)
print("Reversed list:", my_list)


List: [10, 20, 30, 40, 50]
Length of list: 5
First element: 10
Last element: 50
First element: 10
Second element: 20
Last element: 50
Second last element: 40
Sliced list: [20, 30, 40]
Reversed list: [50, 40, 30, 20, 10]
Length of list: 5
Appended list: [60, 50, 25, 20, 10]
Inserted list: [60, 50, 25, 20, 10]
After removal: [60, 50, 25, 20, 10]
Popped element: 40
Count of 20: 1
Sorted list: [60, 50, 25, 20, 10]
Reversed list: [60, 50, 25, 20, 10]
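
Because every method above changes `my_list` in place, the intermediate states are easier to follow when each one is printed straight after the operation. The sketch below repeats the same steps that way; it also uses the built-in `sorted`, which returns a new list instead of modifying the original.

```python
my_list = [10, 20, 30, 40, 50]

my_list.append(60)
print("After append(60):", my_list)       # [10, 20, 30, 40, 50, 60]

my_list.insert(2, 25)
print("After insert(2, 25):", my_list)    # [10, 20, 25, 30, 40, 50, 60]

my_list.remove(30)
print("After remove(30):", my_list)       # [10, 20, 25, 40, 50, 60]

popped_element = my_list.pop(3)
print("Popped element:", popped_element)  # 40
print("After pop(3):", my_list)           # [10, 20, 25, 50, 60]

# sorted() leaves my_list untouched and returns a new, sorted list
descending = sorted(my_list, reverse=True)
print("Sorted copy (descending):", descending)  # [60, 50, 25, 20, 10]
print("Original list:", my_list)                # [10, 20, 25, 50, 60]
```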


## 3.2 Dictionary

In [None]:
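# Values currently stored in my_dict. The output below was produced after the
# cell that follows had added "occupation" and deleted "city".
my_dict.values()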

dict_values(['John', 30, 'Computer Science', 'Engineer'])

In [None]:
# Dictionaries are another essential data type in Python. They allow us to store key-value pairs.

# Example dictionary
my_dict = {"name": "John", "age": 30, "city": "New York", "major": "Computer Science"}
print("Dictionary:", my_dict)
print("Name:", my_dict["name"]) # Accessing value associated with the key "name"
print("Age:", my_dict["age"])  # Accessing value associated with the key "age"

# Adding a New Key-Value Pair
# New key-value pairs can be added to a dictionary using assignment.
my_dict["occupation"] = "Engineer"

# Removing a Key-Value Pair
# Key-value pairs can be removed from a dictionary using the del keyword.
del my_dict["city"]

# Checking if a Key Exists
# We can check if a key exists in a dictionary using the 'in' keyword.
is_name_present = "name" in my_dict  # Returns True if "name" is present in the dictionary keys

# Printing the Results
# print("Updated Dictionary:", my_dict)
print("Is 'name' present?", is_name_present)

Dictionary: {'name': 'John', 'age': 30, 'city': 'New York', 'major': 'Computer Science'}
Name: John
Age: 30
Is 'name' present? True
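
Two more dictionary operations come up constantly: reading a key that may be missing, and looping over all key-value pairs. A brief sketch, using a smaller dictionary than the one above:

```python
my_dict = {"name": "John", "age": 30, "major": "Computer Science"}

# get() returns a default instead of raising KeyError when the key is missing
city = my_dict.get("city", "unknown")
print("City:", city)

# items() yields (key, value) pairs in insertion order
for key, value in my_dict.items():
    print(key, "->", value)

# keys() can be converted to a list when an ordinary list is needed
keys = list(my_dict.keys())
print("Keys:", keys)
```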


# 4. Conditional statements, Loops and functions

In [None]:
my_list1 = [1, 2, 3, 4, 5, 6]
my_list = [1, 2, 3, 4, 5]
# The loop runs once per element of my_list1; my_list is printed unchanged each time
for num in my_list1:
    print(my_list)
    print(num)

[1, 2, 3, 4, 5]
1
[1, 2, 3, 4, 5]
2
[1, 2, 3, 4, 5]
3
[1, 2, 3, 4, 5]
4
[1, 2, 3, 4, 5]
5
[1, 2, 3, 4, 5]
6
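
When both the position and the value are needed inside a loop, the built-in `enumerate` provides them together. The sketch below also accumulates a running total, a very common loop pattern; the `fruits` list is made up for the example.

```python
fruits = ["apple", "banana", "cherry"]  # illustrative data

# enumerate() yields (index, element) pairs
for index, fruit in enumerate(fruits):
    print(index, fruit)

# Accumulating a result across iterations
total = 0
for num in [1, 2, 3, 4, 5, 6]:
    total += num
print("Total:", total)  # 21
```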


In [None]:
def classify_number(num):
    if num > 0:
        return "Positive"
    elif num < 0:
        return "Negative"
    else:
        return "Zero"

print("Classification of -3:", classify_number(-3))  # Output: Negative

Classification of -3: Negative


In [None]:
# Here's an example of a function that greets a person differently based on the time of day.
def greet_time(name, hour):
    if hour < 12:
        return "Good morning, " + name + "!"
    elif hour < 18:
        return "Good afternoon, " + name + "!"
    else:
        return "Good evening, " + name + "!"

print(greet_time("John", 9))  # Output: "Good morning, John!"

Good morning, John!
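
The order of the `if` and `elif` branches decides what happens at the boundaries. The sketch below repeats the function and calls it with the hours on either side of each cut-off; it assumes `hour` is an integer on a 24-hour clock, which the original function does not check.

```python
def greet_time(name, hour):
    # Assumes hour is an integer from 0 to 23 (not validated here)
    if hour < 12:
        return "Good morning, " + name + "!"
    elif hour < 18:
        return "Good afternoon, " + name + "!"
    else:
        return "Good evening, " + name + "!"

# 12 is not less than 12, so it falls through to the afternoon branch;
# 18 is not less than 18, so it falls through to the evening branch.
for hour in (11, 12, 17, 18):
    print(hour, greet_time("John", hour))
```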


In [None]:
# For Loops, While Loops, Conditional Statements, and Functions

# These are fundamental concepts in Python that allow us to perform repetitive tasks, make decisions, and encapsulate reusable blocks of code.

# Example 1: For Loop
# For loops are used to iterate over a sequence (e.g., lists, tuples, strings).
# Here's an example of iterating over a list and printing each element.
my_list = [1, 2, 3, 4, 5]
for num in my_list:
    print(num)

# Example 2: While Loop
# While loops are used to execute a block of code repeatedly as long as a condition is true.
# Here's an example of using a while loop to print numbers from 1 to 5.
count = 1
while count <= 5:
    print(count)
    count += 1

# Example 3: Conditional Statements (if, elif, else)
# Conditional statements are used to make decisions based on conditions.
# Here's an example of using conditional statements to determine if a number is positive, negative, or zero.
def classify_number(num):
    if num > 0:
        return "Positive"
    elif num < 0:
        return "Negative"
    else:
        return "Zero"

print("Classification of -3:", classify_number(-3))  # Output: Negative


# Example 4: Function with Conditional Statement
# Functions can contain conditional statements to perform different actions based on conditions.
# Here's an example of a function that greets a person differently based on the time of day.
def greet_time(name, hour):
    if hour < 12:
        return "Good morning, " + name + "!"
    elif hour < 18:
        return "Good afternoon, " + name + "!"
    else:
        return "Good evening, " + name + "!"

print(greet_time("John", 9))  # Output: "Good morning, John!"




In [None]:
# Example 10: Function to Find Maximum Number
def find_max(a, b):
    if a > b:
        return a
    else:
        return b

print("Maximum of 5 and 7:", find_max(5, 7))

Maximum of 5 and 7: 7
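
The same comparison logic extends from two numbers to a whole list by combining a loop with a conditional. `find_max_in_list` is a hypothetical helper written for this sketch, and it assumes the list is not empty; in practice Python's built-in `max` does the same job.

```python
def find_max_in_list(values):
    """Return the largest element of a non-empty list."""
    largest = values[0]
    for value in values[1:]:
        if value > largest:
            largest = value
    return largest

numbers = [3, 9, 2, 7]
print("Maximum (loop):", find_max_in_list(numbers))
print("Maximum (built-in):", max(numbers))
```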


In [None]:
# Example 7: For Loop
for i in range(2,10,2):
    print(i)

2
4
6
8
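
`range` accepts up to three arguments, `range(start, stop, step)`, and always excludes `stop`. Wrapping it in `list` makes the generated numbers visible at a glance:

```python
up_to_five = list(range(5))          # start defaults to 0, step to 1 -> [0, 1, 2, 3, 4]
evens = list(range(2, 10, 2))        # from 2 up to (not including) 10 -> [2, 4, 6, 8]
countdown = list(range(10, 0, -3))   # a negative step counts down     -> [10, 7, 4, 1]

print(up_to_five)
print(evens)
print(countdown)
```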


In [None]:
# Loops

# Loops are used to execute a block of code repeatedly.

# Example 7: For Loop
for i in range(5):
    print(i)

# Example 8: While Loop
count = 0
while count < 5:
    print(count)
    count += 1

# Functions

# Functions are blocks of reusable code that perform a specific task.

# Example 9: Function to Calculate Square
def square(num):
    return num ** 2

print("Square of 4:", square(4))

# Example 10: Function to Find Maximum Number
def find_max(a, b):
    if a > b:
        return a
    else:
        return b

print("Maximum of 5 and 7:", find_max(5, 7))

In [None]:
# Example 9: Function to Calculate Square
def square(num):
    return num ** 2

print("Square of 4:", square(4))

Square of 4: 16
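
A function such as `square` is often applied to every element of a sequence. One idiomatic way to do that is a list comprehension, sketched here with an optional filter condition:

```python
def square(num):
    return num ** 2

# Apply the function to each number from 1 to 5
squares = [square(n) for n in range(1, 6)]
print("Squares:", squares)            # [1, 4, 9, 16, 25]

# Add a condition to keep only the even inputs
even_squares = [square(n) for n in range(1, 6) if n % 2 == 0]
print("Even squares:", even_squares)  # [4, 16]
```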
