# Data structure

In one sentence, data structures are different ways of storing and organizing data on a computer.

We encounter many kinds of data structures in our daily lives.

**Example 1**

![image.png](attachment:edeb6160-9f7b-4bd3-a803-2f07429ce3ec.png)

* Say we have a pile of wooden pieces that are clearly not organized.
* If we want to pick a black piece from this pile, we face a problem.
* We need to check the pieces one by one to find the black ones.
* Because the items are not organized, selecting the black ones is difficult and time-consuming.

* Now let's look at the same items arranged in an organized way.
* We have simply taken the unorganized items and arranged them neatly.
* Selecting a black piece from this arrangement is easy and takes much less time.

![image.png](attachment:ea30bdf0-e607-4ca3-94e7-2b41e189a573.png)
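The same idea can be sketched in Python. This is only an illustration: the `pieces` inventory and its colors are made-up data, and grouping with a dictionary is one possible way to "organize" the items.

```python
from collections import defaultdict

# Hypothetical inventory: each piece of wood is (id, color).
pieces = [(1, "brown"), (2, "black"), (3, "white"), (4, "black"), (5, "brown")]

# Unorganized: every piece must be checked to find the black ones.
black_by_scan = [pid for pid, color in pieces if color == "black"]

# Organized: group the pieces by color once, then look a color up directly.
by_color = defaultdict(list)
for pid, color in pieces:
    by_color[color].append(pid)

black_by_lookup = by_color["black"]

print(black_by_scan)    # [2, 4]
print(black_by_lookup)  # [2, 4]
```

Both approaches give the same answer, but once the grouping exists, each later lookup no longer needs to scan the whole pile.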

> *All applications process data.*
> 
> *Before processing the data, we have to organize it in a way that makes the processing efficient.*
> 
> *From a performance point of view, the efficiency of a software application depends on how its data is stored, organized, and grouped during program execution.*

**Example 2**

![image.png](attachment:a4d33b4e-a1cc-4ede-b96a-9e432160da34.png)

* Another example is a crowd of people who all want to buy a concert ticket. Without some organization, it becomes almost impossible to get a ticket.
* The organized way for people to get tickets is to form a queue. In computer science this is called the queue data structure, which follows the first-in-first-out (FIFO) principle.
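A minimal sketch of the ticket line, assuming `collections.deque` from the standard library as the queue and made-up names for the people waiting:

```python
from collections import deque

# People join the line at the back, in arrival order.
ticket_line = deque()
for person in ["Ana", "Ben", "Cara"]:
    ticket_line.append(person)

# Tickets are sold from the front: first in, first out.
first_served = ticket_line.popleft()
second_served = ticket_line.popleft()

print(first_served, second_served)  # Ana Ben
print(list(ticket_line))            # ['Cara']
```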

**Example 3**

![image.png](attachment:d4b31611-fb46-4c5d-9da7-916d3b0d9434.png)

* As another example, imagine a bunch of books scattered on a table that you want to return to the library. Without arranging them in some way, they are hard to carry.
* The organized way is to stack the books so that they are easier to carry. In computer science this is called the stack data structure, which follows the first-in-last-out principle, also known as last-in-first-out (LIFO).
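The book pile can be sketched with a plain Python list used as a stack. The book titles are placeholders chosen for illustration.

```python
# Books are placed on top of the pile one at a time.
books = []
for title in ["Algebra", "Biology", "Chemistry"]:
    books.append(title)

# The book placed last is the first one taken off: last in, first out.
top_book = books.pop()

print(top_book)  # Chemistry
print(books)     # ['Algebra', 'Biology']
```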
 

### What is a Data Structure?

* Data structures are different ways of **Organizing** and **Storing** the data so that it can be used efficiently. 
* They provide a systematic way to store and manage information, making it accessible and efficient to work with.
* Applications perform operations based on the given data. Before processing the data, we have to organize the data in a certain way that makes the process very efficient.
* From a software application performance point of view, the efficiency and performance of the software depend on how the data is stored, organized, and grouped during the program execution.

### Real-life use case

* Arranging the crowd of people so that tickets can be distributed in an organized way - **Queue** (FIFO).
* Arranging the bunch of books to return them to the library - **Stack** (LIFO).

There are many different data structures, so the real question is which one to choose for the best performance of a given program or application.

# Algorithms

![image.png](attachment:a9352af4-b209-49fa-9896-f12456180252.png)

If we want to use these wooden pieces for flooring, we need to complete several tasks.
* First, we have to choose the flooring and decide which color to use.
* Next, we need to purchase it and bring it home.
* Third, we need to prepare the floor for installation.
* Then, we need to determine the layout of the flooring in the space.
* Finally, we need to trim the door casing, and our floor is ready.

As you can see, we completed a set of steps to accomplish the task of flooring.
* Our data is the set of wooden pieces.
* First, we organized them, as we would with a data structure.
* Then we carried out the steps in order, which matches the definition of an algorithm as a set of rules.
* To lay the floor, we had to complete these five steps.
* Together, these steps are called an algorithm: a set of instructions for completing a task.

***We use algorithms every day to carry out our tasks.***

![image.png](attachment:097ee4c1-ab5e-48dd-8508-710a1be54b0b.png)

![image.png](attachment:5d17a169-b693-4e9c-aed1-ddacc194ea2c.png)

![image.png](attachment:6fe63d90-e71c-4d20-9aae-8687d48f78ce.png)

In computer science, an algorithm is a set of rules for a computer program to accomplish a task. Learning about different types of algorithms and knowing when to apply them allows us to write time and memory-efficient programs.

![image.png](attachment:181a925a-e7cc-4a36-af22-aee7cd0e9db3.png)

**How do Google and Facebook transmit live video across the Internet?**

They use audio and video compression algorithms to transmit video in real time.

**Which algorithm is used to find the shortest path on the map?**

Graph algorithms are used in Google, Apple, or Microsoft Maps to find the shortest path between two locations.

![image.png](attachment:f00b82d7-5902-4061-aa85-cd0e7f61fc75.png)

![image.png](attachment:1bba74ec-6ffc-4baa-8136-fef572be2c95.png)

Algorithms are used in space exploration as well.

**Optimization** and **scheduling** algorithms are used by NASA to arrange solar panels on the International Space Station.

![image.png](attachment:9f253a80-d184-459d-ab17-33c3dc5b9a19.png)

Gaining a deep knowledge of existing algorithms and applying the right one can make our programs faster.

It's also important to know how to design new algorithms as well as how to analyze their correctness and efficiency.

***What makes a good algorithm?***

A good algorithm meets two criteria.
* **Correctness** → It solves the problem correctly.
* **Efficiency** → It does so efficiently.
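To make the two criteria concrete, here is a small sketch with two hypothetical functions, `sum_loop` and `sum_formula`, that both add the integers from 1 to `n`. Both are correct, but the second one does a fixed amount of work regardless of `n`.

```python
def sum_loop(n):
    """Correct, but the work grows with n: one addition per number."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_formula(n):
    """Correct and efficient: uses the closed form n * (n + 1) / 2."""
    return n * (n + 1) // 2


print(sum_loop(100))     # 5050
print(sum_formula(100))  # 5050
```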

# Types of data structure

![image.png](attachment:ef3b05e0-5cbf-4bb4-b0f0-fe9d2f96b5af.png)

In Python, data structures can be classified as either **primitive** or **non-primitive**.

**Primitive data structures** are basic data types that cannot be broken into simpler data types, whereas **non-primitive data structures** are more complex and are built from other data types.
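A short sketch of this classification, using made-up values. Note that Python itself treats every value as an object; "primitive" here simply follows the classification used in these notes.

```python
# Primitive values: each holds a single, indivisible piece of data.
count = 42         # int
price = 9.99       # float
name = "oak"       # str
in_stock = True    # bool

# A non-primitive structure: a list built from the primitive values above.
record = [count, price, name, in_stock]

type_names = [type(value).__name__ for value in record]
print(type_names)             # ['int', 'float', 'str', 'bool']
print(type(record).__name__)  # list
```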

![image.png](attachment:794737e0-a4b2-4fd2-802a-9a5a2136590d.png)

## Primitive vs. Non-primitive Data Structures

![image.png](attachment:365e134c-a56b-4c20-a1ca-981576a5eac5.png)


## Linear Data Structures
* Linear data structures are those in which the elements are arranged in a sequential order with each element connected to its adjacent elements. 
* These data structures are used to represent a sequence of data where the order of elements is important.
* Python offers many linear data structures, including **lists**, **tuples**, **arrays**, **linked lists**, **stacks**, and **queues**.

## Non-linear Data Structures
* Non-linear data structures are those in which the elements are not arranged in a sequential order.
* These data structures are often used to represent hierarchical or networked relationships, where each element is connected to one or more other elements in a specific way.
* **Sets**, **dictionaries**, **trees**, and **graphs** are non-linear data structures.

## Built-in and user-defined data structure

We can also subdivide these data structures into **built-in data structures** and **user-defined data structures**.

**Built-in data structures** are built into Python itself. We don't need to import a library or implement them ourselves.

* **linear data structures**: lists and tuples are built-in data structures.
* **non-linear data structures**: sets and dictionaries are built-in data structures.
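The four built-in structures can be tried directly, with no imports. The color data below is invented for illustration.

```python
colors_list = ["black", "brown", "black"]   # list: ordered, mutable, allows duplicates
colors_tuple = ("black", "brown")           # tuple: ordered, immutable
colors_set = set(colors_list)               # set: unique elements only
stock = {"black": 2, "brown": 1}            # dict: maps keys to values

print(len(colors_list))                 # 3
print(colors_tuple[0])                  # black
print(colors_set == {"black", "brown"})  # True
print(stock["black"])                   # 2
```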

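The other data structures shown here are **user-defined data structures**.

This means that to use them we need to import a library or implement them ourselves. Some of them, such as arrays, deques, and heaps, are available in Python's standard library through the `array`, `collections`, and `heapq` modules.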

## Why do we need so many different types of data structures?

As you can see, there are various types of data structures, and you might be wondering **why we need so many of them.**

* The answer is that each data structure has its own properties, which make it efficient in particular circumstances.
* For instance,
    * graph data structures work well for maps,
    * stack data structures work well for the back and forward buttons of an application because of their first-in-last-out nature.
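The back and forward buttons can be sketched with two stacks. The `History` class and the page names below are hypothetical and only illustrate the idea; real browsers may do this differently.

```python
class History:
    """Minimal back/forward navigation built on two stacks (Python lists)."""

    def __init__(self, start):
        self.current = start
        self.back_stack = []
        self.forward_stack = []

    def visit(self, page):
        # Visiting a new page pushes the current one onto the back stack
        # and discards any pages that were available via "forward".
        self.back_stack.append(self.current)
        self.current = page
        self.forward_stack.clear()

    def back(self):
        if self.back_stack:
            self.forward_stack.append(self.current)
            self.current = self.back_stack.pop()
        return self.current

    def forward(self):
        if self.forward_stack:
            self.back_stack.append(self.current)
            self.current = self.forward_stack.pop()
        return self.current


history = History("home")
history.visit("search")
history.visit("results")
print(history.back())     # search
print(history.back())     # home
print(history.forward())  # search
```

The most recently visited page is always the first one returned by `back()`, which is exactly the first-in-last-out behavior of a stack.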

# Types of Algorithms

There are many different types of algorithms that can be implemented in Python.

![image.png](attachment:363368aa-b8ce-416f-ae7c-93817030a1ae.png)

# Complexity analysis - a brief intro

Complexity analysis, often referred to as **“algorithmic complexity”**, is a way to measure how efficient an algorithm is. It helps us to understand how the algorithm’s performance changes as the size of the input data grows. One common way to express algorithmic complexity is by using something called **“Big O notation”**.

Now let’s break down the key concepts:

* **Algorithm**: This is a step-by-step procedure for solving a problem. For example, sorting a list of numbers is a problem, and an algorithm like quicksort is a set of instructions for solving it.

* **Input size**: This is the amount of data your algorithm has to work with. For sorting algorithms, it would be the number of items in the list to be sorted. The input size can vary, and complexity analysis helps us to understand how the algorithm behaves as the input size grows.

* **Efficiency**: We want our algorithms to be efficient, which means they should use as few resources as possible, such as time and memory. Complexity analysis helps us to quantify this efficiency.

# Big-O notation

![image.png](attachment:6cb03027-5ad0-4507-bc0f-e9881b23bfff.png)

Big O notation is a way to describe how the runtime (or memory usage) of an algorithm grows relative to the size of its input.

Here are a few common Big O notations and what they mean:

* **`O(1)` → Constant time**: This means the algorithm's runtime doesn't depend on the input size. It is like saying that **“no matter how many items are in a list, looking one up by its index always takes the same amount of time”**.

* **`O(log n)` → Logarithmic time**: As the input size grows, the algorithm's runtime grows, but it doesn’t grow very quickly. It's like having a phonebook and being able to find a name quickly by splitting it in half repeatedly.

* **`O(n)` → Linear time**: The runtime of the algorithm grows linearly with the input size. If you double the input size, it will roughly take twice as long to run. It’s like checking each item in a list one by one.

* **`O(n^2)` → Quadratic time**: The runtime of the algorithm grows with the square of the input size. If you double the input size, it will take about four times as long to run. It’s like nested loops that iterate over all pairs of items in a list.
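One way to see these growth rates is to count steps instead of measuring time. The four functions below are hypothetical helpers written for this illustration; each returns the number of basic steps it performs for a given input size.

```python
def constant_steps(items):
    # O(1): a single index access, regardless of how long the list is.
    _ = items[0]
    return 1


def logarithmic_steps(n):
    # O(log n): halve the problem size until only one item remains.
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps


def linear_steps(items):
    # O(n): visit each item once.
    steps = 0
    for _ in items:
        steps += 1
    return steps


def quadratic_steps(items):
    # O(n^2): visit every pair of items with nested loops.
    steps = 0
    for _ in items:
        for _ in items:
            steps += 1
    return steps


for n in (8, 16):
    data = list(range(n))
    print(n, constant_steps(data), logarithmic_steps(n),
          linear_steps(data), quadratic_steps(data))
# 8 1 3 8 64
# 16 1 4 16 256
```

Doubling the input from 8 to 16 leaves the constant count unchanged, adds one step to the logarithmic count, doubles the linear count, and quadruples the quadratic count.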

 

> **Note**:
> 
> **Big O** notation is a way to compare different algorithms quickly. You want to choose an algorithm with the smallest possible Big O complexity for your specific problem to make it run efficiently, especially when dealing with large amounts of data.


---

![image.png](attachment:ab026cdc-3c6c-4e26-b4a5-60988035e224.png)

# Linear data structures

* Linear data structures are those in which the elements are arranged in a sequential order with each element connected to its adjacent elements.
* That is, in a **linear data structure** each element (except the first and the last) has a **unique predecessor and successor**, forming a **sequential order**.
* These data structures are used to **represent a sequence of data where the order of elements is important**.
* The following data structures in Python are referred to as linear data structures:
    * Lists, tuples, arrays
    * Linked lists
    * Stacks
    * Queues
    * Deques

**Array**

An array is a linear collection of elements with indexed access for efficient retrieval.

![image.png](attachment:a476da38-07f4-4f1c-94c4-112654150633.png)
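As a small sketch, the standard-library `array` module provides a typed array with indexed access. The temperature values are made up for the example.

```python
from array import array

# "i" means every element is a signed integer.
temps = array("i", [21, 23, 19, 25])

third = temps[2]   # indexed access: read the element at position 2
temps[2] = 20      # indexed update
temps.append(22)   # add an element at the end

print(third)           # 19
print(temps.tolist())  # [21, 23, 20, 25, 22]
```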

**Linked list**

In a linked list, elements are connected by pointers, allowing dynamic allocation and efficient insertion or deletion.

![image.png](attachment:2978bba8-d017-4a79-a726-8847b01beb4a.png)
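Python has no built-in linked list, so the following is a minimal hand-written sketch. The `Node` and `LinkedList` classes and their method names are assumptions made for this example, not a standard API.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next  # reference ("pointer") to the following node


class LinkedList:
    def __init__(self):
        self.head = None

    def push_front(self, value):
        # Insertion at the front only rewires one reference.
        self.head = Node(value, self.head)

    def remove(self, value):
        # Unlink the first node holding `value`; return True if found.
        prev, node = None, self.head
        while node is not None:
            if node.value == value:
                if prev is None:
                    self.head = node.next
                else:
                    prev.next = node.next
                return True
            prev, node = node, node.next
        return False

    def to_list(self):
        out = []
        node = self.head
        while node is not None:
            out.append(node.value)
            node = node.next
        return out


songs = LinkedList()
for title in ["C", "B", "A"]:
    songs.push_front(title)
songs.remove("B")
print(songs.to_list())  # ['A', 'C']
```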

**Stack**

A stack follows the Last-In-First-Out (LIFO) principle: elements are added and removed only at the top.

![image.png](attachment:cd8c5474-6a17-4c0a-b910-94061e9ca5c7.png)

**Queue**

A queue adheres to the First-In-First-Out (FIFO) principle and is used for ordered processing.

![image.png](attachment:fe2b5338-4ac6-4461-af93-922a5cf887e4.png)

**Deque**

A deque supports insertion and removal at both ends, offering more flexibility than a stack or a queue.

![image.png](attachment:f378f829-ac09-4454-aad5-6d7cbc728f79.png)
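A brief sketch using `collections.deque` from the standard library, showing insertion and removal at both ends:

```python
from collections import deque

d = deque([2, 3])
d.appendleft(1)      # insert at the left end  -> 1, 2, 3
d.append(4)          # insert at the right end -> 1, 2, 3, 4

first = d.popleft()  # remove from the left end
last = d.pop()       # remove from the right end

print(first, last)   # 1 4
print(list(d))       # [2, 3]
```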

## Need for linear data structure

* **Ordered Storage**: Linear data structures maintain a sequential order, which is essential for scenarios where data must be processed sequentially or accessed in a specific arrangement.
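* **Efficient Access**: Direct indexing (in arrays and lists) or sequential traversal (in linked lists) gives convenient access to elements.
* **Insertion and Deletion**: Linear structures provide methods for adding and removing elements, which is crucial for dynamic data manipulation. How efficient these are depends on the structure and the position: for example, appending to a Python list is fast, while inserting at its front requires shifting every element.
* **Memory Optimization**: Array-based structures store their elements contiguously, which optimizes memory usage and access efficiency. Linked lists trade this for flexibility, since their nodes can live anywhere in memory.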

## Operations on linear data structures

* **Access**: Retrieving elements by index, position, or pointer.
* **Insertion**: Adding new elements at specific positions.
* **Deletion**: Removing elements from specific positions.
* **Traversal**: Iterating through elements sequentially.
* **Search**: Finding the position or existence of an element.
* **Update**: Modifying the value of an element.
* **Sorting**: Arranging elements in a specified order.
* **Merging**: Combining two ordered linear data structures.
* **Memory management**: Allocating and deallocating memory dynamically.
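Most of the operations above can be tried on a plain Python list. This is a sketch with arbitrary numbers; `heapq.merge` is used here as one convenient way to merge two already sorted sequences.

```python
from heapq import merge

items = [30, 10, 20]

first = items[0]              # access by index           -> 30
items.insert(1, 15)           # insertion at a position   -> [30, 15, 10, 20]
items.remove(10)              # deletion by value         -> [30, 15, 20]
position = items.index(20)    # search                    -> 2
items[0] = 5                  # update                    -> [5, 15, 20]

total = 0
for value in items:           # traversal
    total += value            # total ends at 40

items.sort()                  # sorting                   -> [5, 15, 20]
merged = list(merge(items, [1, 18, 25]))  # merging two sorted sequences

print(first, position, total)  # 30 2 40
print(merged)                  # [1, 5, 15, 18, 20, 25]
```

Memory management does not appear in the sketch because Python allocates and frees list storage automatically.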

## Real-world examples of linear data structure

* **Arrays**

    * **Grocery shopping list** → managing your shopping list with each item corresponding to an array index simplifies adding, removing, and checking off items.
    * **Image pixels** → in digital images, arrays store pixel values allowing manipulation and editing of pictures by altering individual pixel colors.


* **Linked list**

    * **Music playlists** → linked lists are suitable for creating playlists where songs are nodes connected in sequence allowing easy rearrangement and modification.
    * **Train cars** → A linked list can represent train cars linked together enabling efficient addition and removal of cars without affecting the entire train.


* **Stacks**

    * **Undo feature** → in software applications, a stack manages undo operations, letting users reverse their actions starting with the most recent one.
    * **Plate stacking** → plates stacked on top of each other represent a real-world example of a stack where the last plate placed is the first one taken.


* **Queues**

    * **Cafeteria line** → Queues model waiting in line at a cafeteria where the first person in is served first maintaining order and fairness.
    * **Ticket counter** → waiting in line to purchase tickets like at a cinema or an event follows the queue concept.


* **Deques**

    * **Sliding glass doors** → Deques are similar to sliding glass doors at entrances allowing people to enter or exit from both sides.
    * **Printing and scanning** → Deques mimic the process of loading and unloading papers for printing and scanning as both ends are accessible.

# Non-linear data structure

Non-linear data structures do not follow a sequential order: each element can connect to multiple elements and thus form complex relationships.

The following data structures are considered non-linear data structures: **Tree**, **Graph**, and **Heap**.

**TREE**

A tree is a hierarchical structure of nodes connected by edges, commonly used for organizing hierarchical data such as file systems.

![image.png](attachment:099bdaea-66ea-4839-aaf8-bf0bbd7a64ea.png)

**GRAPH**

A graph is a collection of nodes connected by edges, allowing a versatile representation of relationships between entities.

![image.png](attachment:9ee4c9bf-c983-4d5f-9d98-b1f01ee2303f.png)
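A graph can be sketched in Python as a dictionary that maps each node to a list of its neighbors. The node names below are arbitrary, and `bfs_order` is a hypothetical helper that visits nodes with breadth-first search.

```python
from collections import deque

# Undirected graph stored as an adjacency dictionary.
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}


def bfs_order(graph, start):
    """Return the nodes reachable from `start` in breadth-first order."""
    visited = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                visited.append(neighbor)
                queue.append(neighbor)
    return visited


print(bfs_order(graph, "A"))  # ['A', 'B', 'C', 'D', 'E']
```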

**HEAP**

A heap is a specialized tree-based structure that satisfies the heap property and is often used to implement priority queues.

![image.png](attachment:55d10404-2b88-41b9-82a5-0a688e5700a6.png)
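Python's standard-library `heapq` module implements a min-heap on top of a list. In this sketch the patient data is invented, and treating a lower number as a higher priority is an assumption of the example.

```python
import heapq

# Each entry is (priority, description); the smallest priority comes out first.
waiting = []
heapq.heappush(waiting, (3, "sprain"))
heapq.heappush(waiting, (1, "chest pain"))
heapq.heappush(waiting, (2, "fracture"))

most_urgent = heapq.heappop(waiting)  # removes and returns the smallest entry
next_up = waiting[0]                  # peek at the new smallest entry

# heapify turns an existing list into a heap in place.
numbers = [9, 4, 7]
heapq.heapify(numbers)

print(most_urgent)  # (1, 'chest pain')
print(next_up)      # (2, 'fracture')
print(numbers[0])   # 4
```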

## Need for non-linear data structure

* **Complex relationships** → Non-linear structures can represent intricate relationships between elements, which suits data with many interconnections.
* **Hierarchical organization** → Trees organize data hierarchically, making them ideal for data such as company organization charts.
* **Graph-based modeling** → Graphs enable the modeling of various real-world networks, from social media connections to computer networks.
* **Efficient priority management** → Heaps efficiently manage priority-based operations such as extracting the highest-priority element.

## Operations on non-linear data structures

* **Tree traversal** → Navigating trees using methods like in-order, pre-order, and post-order traversal to explore hierarchical data.
* **Graph traversal** → Exploring graphs through traversal algorithms like breadth-first search (BFS) and depth-first search (DFS).
* **Heap operations** → Performing heap-specific operations like insertion, deletion, and heapifying to maintain the heap property.
* **Balancing** → Keeping trees balanced, as AVL and red-black trees do, to maintain efficient search and insertion operations.
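The three tree traversals can be sketched on a small binary tree. The `TreeNode` class and the example values are assumptions made for illustration.

```python
class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def in_order(node):
    # Left subtree, then the node itself, then the right subtree.
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)


def pre_order(node):
    # The node itself, then the left subtree, then the right subtree.
    if node is None:
        return []
    return [node.value] + pre_order(node.left) + pre_order(node.right)


def post_order(node):
    # Left subtree, then the right subtree, then the node itself.
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.value]


#       4
#      / \
#     2   6
#    / \
#   1   3
root = TreeNode(4, TreeNode(2, TreeNode(1), TreeNode(3)), TreeNode(6))

print(in_order(root))    # [1, 2, 3, 4, 6]
print(pre_order(root))   # [4, 2, 1, 3, 6]
print(post_order(root))  # [1, 3, 2, 6, 4]
```

All three are depth-first traversals; they differ only in when the current node is recorded relative to its subtrees.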

## Real-world examples of non-linear data structures

**TREE**
* **File system** → organizing files in a hierarchical structure mirrors the tree concept with directories as nodes and files as leaves.
* **Family genealogy** → a family tree represents family relationships and illustrates the hierarchical nature of tree structures.

**GRAPH**
* **Social networks** → Social media platforms model users and their connections using graph structures to facilitate friend suggestions.
* **Road networks** → Maps utilize graphs to represent roads and intersections helping navigation systems to find the shortest routes.

**HEAP**
* **Priority queues** → A hospital’s patient queue can be modeled using a heap with patients ordered by priority for efficient treatment.
* **Memory allocation** → Note that the "heap" memory region that programming languages use for dynamic allocation shares its name with the heap data structure but is a separate concept; it is not necessarily organized as a heap.

# Choosing the right data structure

## Factors influencing data structure selection

**Data characteristics**
* Analyzing the type, size, and nature of the data is crucial to align the data structure's properties with the data itself.
* For instance, choosing an array for fixed-size data or a linked list for dynamic data with frequent insertions/deletions enhances efficiency.

**Operations and usage**
* Identifying primary operations such as search, insertion or deletion clarifies the essential functionality that the data structure must provide.
* Evaluating the frequency of each operation guides the selection towards structures optimized for the most common tasks, improving overall performance.

**Memory and storage constraints**
* Evaluating memory availability and storage needs ensures efficient resource utilization and prevents unnecessary waste.
* Opting for memory-efficient structures like bit arrays or compressed data representations is essential for systems with limited memory resources.

**Access patterns**
* Recognizing access patterns, whether sequential, random, or keyed, helps to choose a data structure that matches the expected data retrieval method.
* For instance, using arrays for sequential access or hash tables for quick random access based on keys enhances access efficiency.

**Algorithmic requirements**
* Considering algorithms that will interact with the chosen structure ensures compatibility and efficiency.
* For example, when planning to implement a graph algorithm, selecting data structures like adjacency lists can simplify traversal and manipulation tasks.
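As one illustration of a structure chosen to fit an algorithm, the sketch below stores a small directed road network as an adjacency list and runs Dijkstra's shortest-path algorithm on it with a heap. The network, its weights, and the function name `shortest_distances` are made up for this example.

```python
import heapq

# Adjacency list: each node maps to (neighbor, distance) pairs.
roads = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 1)],
    "C": [("B", 2), ("D", 5)],
    "D": [],
}


def shortest_distances(graph, source):
    """Dijkstra's algorithm; assumes all edge weights are non-negative."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale entry: a shorter route was already found
        for neighbor, weight in graph[node]:
            new_dist = d + weight
            if new_dist < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_dist
                heapq.heappush(heap, (new_dist, neighbor))
    return dist


print(shortest_distances(roads, "A"))
# {'A': 0, 'B': 3, 'C': 1, 'D': 4}
```

The adjacency list makes "all roads leaving this node" a single dictionary lookup, which is the operation the algorithm repeats most often.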


## Common data structure selection scenarios

* **Search operations**: Comparing the efficiency of search operations in arrays, hash tables, and binary search trees.

* **Insertion and deletion**: Analyzing the trade-offs between array-based data structure and linked structures for dynamic insertions and deletions.

* **Ordered data**: Discussing when arrays or balanced trees are preferable for maintaining ordered data.

* **Memory efficiency**: Exploring compact data structures like bit arrays or bloom filters for memory-constrained environments.

* **Frequent access**: Addressing scenarios where hash tables or self-balancing trees are suited for fast lookups.
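The search scenario can be sketched with three approaches on the same data. Python has no built-in binary search tree, so this example substitutes binary search on a sorted list (via the standard-library `bisect` module) for the ordered case, and uses a `set` as the hash-based structure. The function names are hypothetical.

```python
from bisect import bisect_left

values = [3, 8, 15, 23, 42]  # already sorted


def linear_search(items, target):
    # O(n): check items one by one; works on unsorted data too.
    for index, item in enumerate(items):
        if item == target:
            return index
    return -1


def binary_search(sorted_items, target):
    # O(log n): repeatedly halves the search range; requires sorted input.
    index = bisect_left(sorted_items, target)
    if index < len(sorted_items) and sorted_items[index] == target:
        return index
    return -1


lookup = set(values)  # hash-based membership test, O(1) on average

print(linear_search(values, 23))  # 3
print(binary_search(values, 23))  # 3
print(23 in lookup)               # True
print(binary_search(values, 10))  # -1
```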