**Day 1:** Understanding Data Types

**Theory Question**

***Answer 1***

| Feature        | Primitive Data Types              | Non-Primitive Data Types                              |
| -------------- | --------------------------------- | ----------------------------------------------------- |
| **Definition** | Store simple, single values       | Store collections or complex structures               |
| **Examples**   | `int`, `float`, `bool`, `str`     | `list`, `tuple`, `dict`, `set`, `class`               |
| **Mutability** | Immutable                         | Mostly mutable (except `tuple` and `frozenset`)       |
| **Value Type** | Direct simple value               | Can contain multiple values/objects                   |
| **Methods**    | Few methods (e.g., `str.upper()`) | Many methods for manipulation (e.g., `list.append()`) |


***Answer 2***

**Why It's Important**

1. **Memory Efficiency**

   * Using the right data type saves memory.
   * Example: Storing age as an `int` is the natural fit; storing it as a `string` takes more space and forces a conversion before any arithmetic.

2. **Performance (Speed)**

   * Correct data types make operations faster.
   * Adding two `int` values is much faster than converting two `strings` into numbers first.

3. **Data Accuracy**

   * Prevents errors like rounding issues or invalid values.
   * Example: Use `float` for measurements with a fractional part (like temperature), not `int`, which would drop everything after the decimal point.

4. **Code Reliability & Fewer Bugs**

   * Ensures only valid operations happen.
   * Example: You can't do `"Hello" / 2` if you store text as a string (Python raises a `TypeError`).

5. **Readability & Maintenance**

   * Choosing the right type makes the code easier for others (and yourself) to understand.

----

**Real-Life Example**

Imagine you are building a **banking application**:

* **Balance of an account** needs a type that keeps the fractional part (e.g., `1520.75`): a `float` in simple exercises, or `decimal.Decimal` when every cent must be exact.

  * If you used `int`, it would drop the decimal part and you'd lose cents/paise, which is a big problem in banking.

* **Account number** should be stored as a `string`, not an `int`.

  * Why? Because account numbers are identifiers, not something you‚Äôll add/subtract.
  * If stored as an `int`, leading zeros (e.g., `00123456`) would be lost and the value would become `123456`, which is **wrong**.

* **Customer's age** should be stored as an `int`.

  * If you mistakenly use `string`, ages are compared character by character, so `"25" > "100"` evaluates to `True`, which is wrong for numbers.
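The three choices in the banking example can be tried out directly. This is a minimal sketch, not a real banking model: the variable names are illustrative, and the balance uses `decimal.Decimal` as an assumption (a `float` would also keep the fractional part, just not always exactly).

```python
from decimal import Decimal

# Hypothetical account fields; the names are illustrative only.
balance = Decimal("1520.75")   # exact decimal amount, keeps the cents
account_number = "00123456"    # identifier: kept as text so leading zeros survive
age = 25                       # whole number used in comparisons

# Converting the identifier to int silently drops the leading zeros.
as_int = int(account_number)
print(as_int)                  # 123456

# Strings are compared character by character, numbers by value.
print("25" > "100")            # True  ('2' comes after '1')
print(25 > 100)                # False
```

The string comparison goes wrong because only the first differing character decides the result, which is why numeric fields such as age belong in an `int`.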

---




***Answer 3***

**How Memory Allocation and Data Type Influence Program Performance**

1. **Memory Usage (Space Efficiency)**

* Each data type occupies a different amount of memory.

* Choosing a larger type than necessary wastes memory.

* Choosing a smaller type may cause overflow or loss of precision.

**Example (Python integers vs. floats):**

x = 10       # int (exact whole number; the natural choice for counting/looping)
y = 10.0     # float (binary floating point; needed for fractions, but not always exact)
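Object sizes can be inspected rather than guessed. The sketch below uses the standard `sys.getsizeof` function; the exact byte counts depend on the interpreter and platform, so no specific numbers are assumed here.

```python
import sys

x = 10        # int
y = 10.0      # float
s = "10"      # the same digits stored as text

# sys.getsizeof reports the size of the object itself, in bytes.
for label, value in [("int", x), ("float", y), ("str", s)]:
    print(label, sys.getsizeof(value))

# For a list, the reported size covers the list object and its array of
# references, not the integers those references point to.
numbers = list(range(1000))
print("list of 1000 ints", sys.getsizeof(numbers))
```

Running this on your own machine is a quick way to check which representation actually costs more before optimising anything.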

2. **Execution Speed (Time Efficiency)**

* Some operations are faster on certain types.

* Example: Adding two integers is faster than adding two strings representing numbers.

**Example:**

* Adding integers (fast)
a, b = 100, 200
print(a + b)   # 300

* Adding numbers stored as strings (slower + risk of error)
x, y = "100", "200"
print(x + y)   # "100200" (concatenation, not addition!)

*  Wrong type (str instead of int) changes both meaning and performance.
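When numbers arrive as text, they have to be converted before arithmetic makes sense. The following is a small sketch of that step; `to_int` is a hypothetical helper name, not part of the original exercise.

```python
x, y = "100", "200"

# With strings, + joins the text.
joined = x + y
print(joined)            # 100200

# Convert first to get real addition.
total = int(x) + int(y)
print(total)             # 300


def to_int(text):
    """Return the integer value of text, or None if it is not a whole number."""
    try:
        return int(text)
    except ValueError:
        return None


print(to_int("42"))      # 42
print(to_int("12a"))     # None
```

Returning `None` for bad input is only one possible design; raising the error to the caller is equally reasonable.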

3. **Precision and Accuracy**

* Data type decides how much detail a value can store.

* Using the wrong type can cause rounding errors or overflow.

**Example (float precision issue in finance):**

balance = 100.10
payment = 0.30
print(balance - payment * 3)   # expected 99.20, but prints 99.19999999999999 (precision error)


In banking systems this is unacceptable, so programmers use the `Decimal` type (from Python's `decimal` module) instead of `float` for accuracy.
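The same calculation can be repeated with `decimal.Decimal` from the standard library. This is a minimal sketch; it builds the values from strings, because constructing a `Decimal` from a `float` would carry the binary rounding error along.

```python
from decimal import Decimal

# Binary floats cannot represent most decimal fractions exactly.
print(0.1 + 0.2 == 0.3)        # False

# Decimal values built from strings keep the digits exactly as written.
balance = Decimal("100.10")
payment = Decimal("0.30")
remaining = balance - payment * 3
print(remaining)               # 99.20
```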

4. **References vs. Values**

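* In Python, every variable holds a reference to an object. Immutable types (like int, float, bool, str) cannot be changed in place, so they behave like simple values.

* Mutable types (like list, dict) can be changed in place, so every name that refers to the same object sees the change.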

* Large collections can take more memory and slow down performance if not managed properly.

**Example (list vs. int):**

x = 5        # int: one small object
y = [5]*1000000   # list with 1 million items: roughly 8 MB of references on 64-bit CPython
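The practical effect of references shows up when two names point at the same mutable object. The sketch below illustrates that behaviour with plain built-in types; the variable names are arbitrary.

```python
# Two names bound to the same list object.
a = [1, 2, 3]
b = a
b.append(4)
print(a)          # [1, 2, 3, 4]  (the change is visible through both names)
print(a is b)     # True

# Integers cannot be changed in place, so += rebinds m to a new object.
n = 5
m = n
m += 1
print(n, m)       # 5 6

# copy() creates an independent (shallow) copy of the list.
c = a.copy()
c.append(5)
print(a)          # [1, 2, 3, 4]
print(c)          # [1, 2, 3, 4, 5]
```

Copying a large list duplicates all of its references, so it is worth doing only when independent data is actually needed.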




**Real-Life Example:**
In a gaming application,

* Player's health (0–100) should be int (fast, small).

* Player‚Äôs position (x, y coordinates) should be float (for precision).

* Game inventory should be a list or dict (complex structure).

* Choosing wrong types (e.g., storing health as a string "100") would make the game slower and buggy.

***Answer 4***

**Key Characteristics of String Data Type**

* Definition:

A string is a sequence of characters enclosed in quotes ('...', "...", or '''...''').

* Immutable:

Once created, a string cannot be changed directly.

Any modification creates a new string.

name = "Guneet"
name[0] = "K"   # TypeError: strings are immutable and do not support item assignment


* Indexing & Slicing:

Strings can be accessed like arrays.

Positive & negative indexing supported.

text = "Python"
print(text[0])     # 'P'
print(text[-1])    # 'n'
print(text[0:3])   # 'Pyt'
 

* Concatenation & Repetition:

Strings can be combined (+) or repeated (*).

print("Hi " + "there")   # "Hi there"
print("Go! " * 3)        # "Go! Go! Go! "


* Built-in Methods:

Strings have many built-in methods (upper(), lower(), find(), replace(), split(), etc.).

message = "hello world"
print(message.upper())   # "HELLO WORLD"


* Versatile:

Can store text, numbers (as characters), and even special symbols.
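Because strings are immutable, "changing" one always means building a new string. A short sketch of the usual workarounds, reusing the name from the immutability example:

```python
name = "Guneet"

# Item assignment is rejected for strings.
try:
    name[0] = "K"
except TypeError:
    print("strings cannot be modified in place")

# Build a new string from a replacement character and a slice of the old one.
new_name = "K" + name[1:]
print(new_name)    # Kuneet
print(name)        # Guneet  (the original is untouched)

# String methods such as replace() also return a new string.
replaced = name.replace("G", "K", 1)
print(replaced)    # Kuneet
```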

------

**Common Real-Life Scenario**

* User Input in Applications

When building a login system, usernames, emails, and passwords are stored as strings.

**Example:**

username = "guneet_singh"

password = "mySecureP@ss123"

email = "guneet@example.com"

print(f"Welcome {username}, your email is {email}")

**PRACTICAL QUESTION**

* Accepting inputs

int_val = int(input("Enter an integer: "))        # accept input as an integer
print(int_val, type(int_val))

5 <class 'int'>


float_val = float(input("Enter a float: "))       # accept input as a float
print(float_val, type(float_val))

4.5 <class 'float'>


str_val = input("Enter a string: ")              # accept input as a string
print(f"you entered a string: {str_val}", type(str_val))

you entered a string: suppu <class 'str'>


user_input = input("Enter a boolean value (true/false): ").strip().lower()   # accept input as a boolean

if user_input in ['true', 'yes', '1']:
    boolean_value = True
elif user_input in ['false', 'no', '0']:
    boolean_value = False
else:
    print("Invalid boolean input!")
    boolean_value = None

print(f"You entered: {boolean_value}",type(boolean_value))

You entered: True <class 'bool'>
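The boolean check can also be wrapped in a function so it is reusable and testable without typing input each time. This is one possible sketch; `parse_bool` is a hypothetical name, and the accepted words are the same ones used in the cell above.

```python
def parse_bool(text):
    """Return True or False for recognised words, or None if the text is invalid."""
    cleaned = text.strip().lower()
    if cleaned in ("true", "yes", "1"):
        return True
    if cleaned in ("false", "no", "0"):
        return False
    return None


print(parse_bool("  Yes "))   # True
print(parse_bool("0"))        # False
print(parse_bool("maybe"))    # None
```

In the interactive version, the call would simply become `parse_bool(input("Enter a boolean value (true/false): "))`.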


data_dict = {   # dictionary collecting the four inputs
    "Integer": int_val,
    "Float": float_val,
    "String": str_val,
    "Boolean": boolean_value
}

# Displaying the dictionary
print("\nDictionary of inputs:")
print(data_dict)



Dictionary of inputs:
{'Integer': 5, 'Float': 4.5, 'String': 'suppu', 'Boolean': True}


**Day 2:** Data Structures and Their Applications

**Theory Questions**

***Answer 1***

**Array**

* Definition: A collection of elements stored in contiguous memory locations and accessed by index.



**Linked List**

* Definition: A collection of nodes where each node contains:

  * The data (value).

  * A pointer (reference) to the next node.


| Feature            | Array                       | Linked List                   |
| ------------------ | --------------------------- | ----------------------------- |
| Memory             | Contiguous                  | Non-contiguous (nodes)        |
| Access time        | Fast (O(1)) by index        | Slow (O(n)) – sequential      |
| Insertion/Deletion | Expensive (shift elements)  | Efficient (just relink nodes) once the position is found |
| Size               | Fixed (or costly to resize) | Dynamic (easy to grow/shrink) |
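The difference in insertion cost can be seen in a few lines. The sketch below is a minimal illustration, not a full linked-list implementation: `Node` and `to_list` are hypothetical names, and a Python `list` stands in for the array.

```python
class Node:
    """One element of a singly linked list (minimal, illustrative class)."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


def to_list(head):
    """Walk the chain from head and collect the values; this takes O(n) steps."""
    values = []
    node = head
    while node is not None:
        values.append(node.value)
        node = node.next
    return values


# Linked list: build A -> C, then insert B after A by relinking one pointer.
head = Node("A", Node("C"))
head.next = Node("B", head.next)
print(to_list(head))     # ['A', 'B', 'C']

# Array-like list: index access is direct, but inserting in the middle
# shifts every later element one slot to the right.
seats = ["A", "C"]
seats.insert(1, "B")
print(seats[1])          # B
print(seats)             # ['A', 'B', 'C']
```

The relinking step is cheap only because the node to insert after (`head`) is already in hand; reaching an arbitrary position still means walking the chain.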


**Real-world Example (Array):**

* Movie theater seats → Each seat is numbered (like arr[0], arr[1], arr[2]...). You can quickly jump to seat 50, but if a new seat has to be added in the middle, you'd need to shift others.



**Real-world Example (Linked List):**

* Playlist in a music app → Songs are linked one after another. You can easily insert a new song between two songs, or remove one, without rearranging the whole playlist.

***Answer 2***

A stack is a data structure that works on the LIFO (Last In, First Out) principle: the last item added is the first one to be removed.

* Meaning of LIFO:

Last In, First Out means the element that goes in last comes out first.
Think of it like stacking plates or books: the last plate you put on top is the first one you pick up.

* Example: Stack of Plates in a Canteen

1) You start stacking plates one on top of another.

2) First, you place Plate A.

3) Then Plate B.

4) Finally, Plate C.

5) Now the stack looks like this:
  Top → [C, B, A] ← Bottom

6) When someone comes to take a plate, they take the topmost plate first, i.e., Plate C.

7) After removing C, B becomes the top plate and will be removed next.
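The plate example maps directly onto a Python list used as a stack, with `append` as push and `pop` as pop. A minimal sketch:

```python
stack = []

# Push plates A, B, C; the end of the list is the top of the stack.
stack.append("Plate A")
stack.append("Plate B")
stack.append("Plate C")
print(stack)       # ['Plate A', 'Plate B', 'Plate C']

# Pop removes the most recently added plate.
top = stack.pop()
print(top)         # Plate C

# Plate B is now on top and would be removed next.
print(stack[-1])   # Plate B
```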

***Answer 3***

| **Basis**                      | **Array**                                                                  | **Hash Map (Dictionary in Python)**                                              |
| ------------------------------ | -------------------------------------------------------------------------- | -------------------------------------------------------------------------------- |
| **Definition**                 | A collection of elements stored in a fixed sequence and accessed by index. | A collection of key–value pairs where each value is accessed using a unique key. |
| **Data Access**                | Accessed by **index number** (e.g., `arr[2]`).                             | Accessed by **key** (e.g., `marks["Math"]`).                                     |
| **Order of Elements**          | Elements are stored in a **specific order**.                               | No positional index; classic hash maps are **unordered**, though Python 3.7+ dicts keep insertion order. |
| **Uniqueness**                 | Duplicate elements are allowed.                                            | Keys must be **unique**, but values can be repeated.                             |
| **Insertion / Deletion Speed** | Slower (O(n)) if done in the middle.                                       | Faster (O(1) average).                                                           |
| **Searching**                  | Slower (O(n)) unless index known.                                          | Very fast (O(1) average).                                                        |
| **Memory Usage**               | Uses less memory.                                                          | Uses more memory due to hash structure.                                          |
| **Best Use Case**              | When data order matters and you access by index.                           | When fast lookups are needed using meaningful keys.                              |
| **Example (Python)**           | `fruits = ["apple", "banana", "cherry"]`                                   | `prices = {"apple": 50, "banana": 20}`                                           |
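The two access styles in the table look like this in practice. The sketch reuses the `fruits` and `prices` examples from the last row; the lookups and the updated price are illustrative.

```python
fruits = ["apple", "banana", "cherry"]
prices = {"apple": 50, "banana": 20}

# List: direct access by position, linear scan to find a value.
print(fruits[2])                     # cherry
position = fruits.index("banana")    # scans from the start
print(position)                      # 1

# Dict: direct access by key.
print(prices["banana"])              # 20

# get() avoids a KeyError when the key is missing.
cherry_price = prices.get("cherry")
print(cherry_price)                  # None

# Keys are unique: assigning to an existing key overwrites its value.
prices["apple"] = 55
print(prices)                        # {'apple': 55, 'banana': 20}
```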


***Answer 4***

A queue is a data structure that works on the FIFO (First In, First Out) principle: the first item added is the first one to be removed.

* Working of a Queue (FIFO Principle)

1) FIFO (First In, First Out) means the element that enters first leaves first.

2) Imagine people standing in a line (queue) at a ticket counter:

   * The person who comes first gets the ticket first.

   * The person who comes last has to wait for others to finish.

----

| **Operation** | **Meaning**                                    | **Example**                             |
| ------------- | ---------------------------------------------- | --------------------------------------- |
| `enqueue()`   | Add an item at the **end (rear)** of the queue | A new customer joins the line           |
| `dequeue()`   | Remove an item from the **front** of the queue | The first customer is served and leaves |
| `peek()`      | Check who is at the **front** without removing | See which customer will be served next  |
| `isEmpty()`   | Check if the queue is empty                    | No customers in line                    |
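Python has no built-in `enqueue()` or `dequeue()` methods, but the four operations in the table can be sketched with `collections.deque` from the standard library. The mapping of operation names to `deque` calls below is one common convention, not the only one.

```python
from collections import deque

queue = deque()

# enqueue: add at the rear
queue.append("Customer 1")
queue.append("Customer 2")
queue.append("Customer 3")

# peek: look at the front without removing it
front = queue[0]
print(front)             # Customer 1

# dequeue: remove from the front
served = queue.popleft()
print(served)            # Customer 1

# isEmpty: check whether anyone is still waiting
is_empty = len(queue) == 0
print(is_empty)          # False
print(list(queue))       # ['Customer 2', 'Customer 3']
```

A `deque` is used instead of a plain list because removing from the front of a list shifts every remaining element, while `popleft()` does not.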


----

| **Application Area**          | **How Queue is Used**                                                                       |
| ----------------------------- | ------------------------------------------------------------------------------------------- |
| **Customer Service Systems**  | Calls or tickets are handled in the order they are received: the first caller is served first. |
| **Printer Queue**             | Print jobs are processed one by one in the order they were submitted.                       |


**Practical Question**

**Day 3:** Structured, Semi-structured, and Unstructured Data

**Theory Question**

***Answer 1***

1. **Structured Data**

* Definition: Data stored in a fixed format, usually in rows and columns (like in databases or spreadsheets).

* Easy to search, process, and analyze because it follows a strict schema.

**Examples from real life:**
* Bank transactions (date, amount, sender, receiver).

* Student records in a school database (roll number, name, marks).

* Inventory details in a supermarket (item code, price, quantity).

2. **Semi-Structured Data**

* Definition: Data that does not strictly follow tabular format but still has some structure (like tags or hierarchy).

* Not as rigid as structured data, but easier to organize than unstructured data.

**Examples from real life:**

* Emails (sender, receiver, subject are structured; body text is unstructured).

* JSON or XML files (data stored with tags/keys).

* Online product reviews (star rating = structured, review text = unstructured).

3. **Unstructured Data**

* Definition: Data with no predefined structure or organization, making it hard to store in tables.

* Typically requires advanced tools (AI/ML, text mining, NLP) to analyze.

**Examples from real life:**

* Photos and videos on social media.

* Audio recordings (like customer service calls).

* Free-form text (WhatsApp chats, essays, blog posts).
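The step from semi-structured to structured data can be illustrated with a small JSON document. This is a sketch with made-up review records; the field names and values are assumptions chosen only for the example.

```python
import json

# Semi-structured: records share some keys, but not every record has every field.
raw = """
[
  {"name": "Asha", "rating": 5, "review": "Great product", "tags": ["fast", "value"]},
  {"name": "Ravi", "rating": 3}
]
"""
records = json.loads(raw)

# Structured: force every record into the same fixed columns.
columns = ["name", "rating", "review"]
rows = [[record.get(column) for column in columns] for record in records]

for row in rows:
    print(row)
# ['Asha', 5, 'Great product']
# ['Ravi', 3, None]
```

Missing fields become `None`, and the free-form `tags` list is dropped, which shows what is lost when flexible data is squeezed into a rigid table.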

***Answer 2***

**Why is it challenging to process unstructured data?**

1. **No fixed format** – Unlike tables, unstructured data has no rows/columns (e.g., images, videos, text).
2. **High volume** – Social media posts, videos, or logs generate huge amounts of data every second.
3. **Variety** – Data comes in many forms (text, audio, video, images), making it hard to standardize.
4. **Complexity of meaning** – Human language, tone, emotions, and context are hard for machines to understand.
5. **Storage & retrieval** – Traditional relational (SQL) databases aren't efficient for storing and searching unstructured data.
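Even a simple question about free text needs a processing step first. The sketch below counts words in a made-up review using only the standard library; the sample sentence and the crude tokenising pattern are assumptions for illustration.

```python
import re
from collections import Counter

text = "Great phone, great battery. The camera is not great."

# Crude tokenisation: lowercase the text and keep runs of letters.
words = re.findall(r"[a-z']+", text.lower())
counts = Counter(words)

print(counts["great"])          # 3
print(counts.most_common(1))    # [('great', 3)]
```

The count says "great" appears three times, but it cannot tell that "not great" is negative, which is the "complexity of meaning" problem from the list above.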

---

**Tools Commonly Used for Processing Unstructured Data**

**Big Data Tools**

* Hadoop – Distributed storage and processing of large unstructured datasets.
* Apache Spark – Fast, scalable data processing engine for big data analytics.

**Databases**

* MongoDB – A NoSQL database that handles JSON-like semi/unstructured data.
* Cassandra – Scalable database for unstructured and semi-structured data.

**Text & Natural Language Processing (NLP) Tools**

* NLTK, spaCy – For analyzing human language (sentiment, keywords, etc.).
* Elasticsearch – For searching and analyzing large text datasets.

**AI/ML & Cloud Services**

* TensorFlow, PyTorch – For image, video, and audio recognition.
* AWS (S3, Comprehend), Google Cloud AI, Azure Cognitive Services – Cloud-based tools for unstructured data analytics.

---