### The What, Why, and How of DSA

1. First, what is DSA (Data Structures and Algorithms)? Why is it important? What problems does it solve, and how is it different from regular coding?

You might already know how to code, so why is this special term, DSA, used? Why not just call it Python coding? We'll explain why DSA is different from regular coding.

2. Next, we'll cover why DSA is required. You might wonder why, especially if you've never heard of a data analyst or data science interview requiring DSA skills. We need to understand clearly why we are preparing for this.

3. Lastly, we'll discuss how to approach this course effectively. How will it help you? What will you need to do, and what will I be doing? I'll explain how the course is structured in this section.



- Now, let’s start with the first topic:

 

### What is DSA? Data Structures and Algorithms. 

- Instead of going over some boring theoretical concepts, let’s do a fun and intuitive activity. 

- We all know Netflix, right? 
    - It’s a repository of over 50,000 movies, or maybe even more, I don’t know the exact number. So, let’s imagine Netflix as a list of movies: Movie 1, 2, 3, and so on up to 50,000. 

    - Now, let's say I want to find a movie called **Race**, a famous Bollywood movie from 2008. We'll approach this in two ways.

        - 1. (Without Search Bar) - Let's search for the movie without using the search bar. Usually, on Netflix, you'd just type the movie name in the search bar and find it, but let's skip that and manually browse Netflix to find the movie. We'll use a stopwatch to track how long this takes.
        - 2. (With Search Bar)

- Let's start the stopwatch. We'll browse through the Netflix website, looking for *Race* **without using the search bar.** 
    - There are so many movies, so let's start by checking the top 10 in the US today. It's not there. 
    - Let's try action and adventure, as that's the category *Race* would likely be in. Still no luck, even after scrolling through the list and checking one movie at a time. 
    - It's not in comedy or in top searches either, so skip those. 
    - Under Movies, we could filter by genre, but we're not allowed to here.
    - It's not under Hindi Movies & TV either.
    - It's not a recent movie, so it can't be found in the recent list. 
    - Still no luck after 1 minute and 43 seconds, even after applying filters.
    


Now, stop the stopwatch. We spent 1 minute and 52 seconds, and:
- We still didn't find the movie, because it's an older one. 
- Without knowing where it is in the list, it's a long and tedious process. 
- If Netflix has 50,000 movies, we would have to check them one by one, possibly all the way to the end, until we find *Race*. 

![Screenshot%202024-09-29%20at%205.13.35%20PM.png](attachment:Screenshot%202024-09-29%20at%205.13.35%20PM.png)



Now, let’s reset the timer and **use the search bar.** 
- Type "Race" into the search bar, and in a few seconds, there it is: the movie we were looking for. 
- It took only 19 seconds. 

- Without the search bar, it took nearly two minutes with no result. 
- With the search bar, it took just 19 seconds. 

- This is the difference data structures and algorithms make.

Without the search bar 
- If Netflix has 50,000 movies and we check 10 movies per second, a full scan takes at most 50,000 / 10 = 5,000 seconds. It takes less time if the movie turns up earlier in the list.

With the search bar
- It took only seconds (19 in our test).
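
Browsing title by title is essentially a linear search. Here is a minimal sketch, assuming a hypothetical catalog of 50,000 placeholder titles with "Race" placed at an arbitrary position. The catalog contents and the 10-titles-per-second rate are illustrative, not real Netflix data:

```python
def linear_search(titles, target):
    """Check titles one by one; return (index, number of titles checked)."""
    checked = 0
    for index, title in enumerate(titles):
        checked += 1
        if title == target:
            return index, checked
    return -1, checked


# Hypothetical catalog: 50,000 placeholder titles, with "Race" hidden near the end.
catalog = [f"Movie {i}" for i in range(1, 50_001)]
catalog[42_000] = "Race"

index, checked = linear_search(catalog, "Race")
seconds_at_10_per_second = checked / 10  # the browsing speed assumed in the text

print(index, checked, seconds_at_10_per_second)
```

The number of titles checked grows in step with the position of the movie, so a title near the end costs almost as much as a title that is not there at all.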

When you **use a search bar**, 
- the backend does not scan every title one by one; it typically relies on a pre-built, organized structure (for example, a sorted list or an index), 
    - ranks the matching movies, and 
    - delivers faster results. 

- We saw that *Race* was the second movie in the results list. 
- So, this is simply a DSA (Data Structures and Algorithms) concept at work. 
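
We don't know how Netflix's search is actually built, so the following is only an illustration of the general idea: once titles are organized ahead of time, a lookup no longer has to touch every entry. This sketch assumes a hypothetical catalog of placeholder titles and shows two organized structures, a sorted list searched with binary search and a dictionary used as a hash-based index:

```python
def binary_search(sorted_titles, target):
    """Return (index, steps) for target in a sorted list, or (-1, steps)."""
    low, high = 0, len(sorted_titles) - 1
    steps = 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if sorted_titles[mid] == target:
            return mid, steps
        if sorted_titles[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return -1, steps


# Hypothetical catalog: 50,000 placeholder titles plus "Race".
catalog = [f"Movie {i:05d}" for i in range(1, 50_001)] + ["Race"]
sorted_catalog = sorted(catalog)  # organized once, ahead of time
position_of = {title: i for i, title in enumerate(sorted_catalog)}  # hash-based index

position, steps = binary_search(sorted_catalog, "Race")
print(position, steps)      # steps stays small because the range halves each time
print(position_of["Race"])  # dictionary lookup: no scanning at all
```

With about 50,000 sorted titles, binary search needs at most 16 comparisons, because each comparison discards half of the remaining titles.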

![Screenshot%202024-09-29%20at%205.49.25%20PM.png](attachment:Screenshot%202024-09-29%20at%205.49.25%20PM.png)


Now, coming to DSA: 
  - A lot of operations are involved in technology and daily tasks. 
  - Every process can be time-consuming. 
  - *DSA helps to solve problems in the most efficient way.*

- If someone asks what DSA is,
    - it's the most efficient way to code anything. 
    - Essentially, DSA is about finding the best way to code with the 
        - least time and 
        - space.

First, we need to understand how DSA helps us. 

#### When there's a lot of data in the backend, how can we streamline the operation to minimize time? 

DSA addresses two main aspects: 
1. time complexity and 
2. space complexity. 

- Time complexity refers to how long a process takes (how the running time grows as the input grows), while 
- space complexity refers to the storage (memory) it requires. 
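
Time and space often trade off against each other. As a small illustration (the function names are made up for this sketch), here are two ways to check whether a list contains a duplicate: one uses almost no extra memory but compares every pair, the other spends extra memory on a set to finish in a single pass:

```python
def has_duplicates_nested(items):
    """O(n^2) time, O(1) extra space: compare every pair of items."""
    n = len(items)
    for i in range(n):
        for j in range(i + 1, n):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_set(items):
    """O(n) time on average, O(n) extra space: remember what was seen."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False


print(has_duplicates_nested([3, 1, 4, 1, 5]))  # True
print(has_duplicates_set([3, 1, 4, 1, 5]))     # True
print(has_duplicates_set([3, 1, 4]))           # False
```

Both functions return the same answers; they differ only in how much time and memory they need to get there.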

In the modern era, when everything is shifting to the cloud and you're paying for compute and storage, 
- reducing time and space complexity becomes crucial because it reduces costs. 
- Every search iteration costs money.

For example, if you're building an AI model, DSA plays a significant role. 

- Suppose you're creating a chatbot, and
- a user asks a question. 
- The AI has to answer it by 
    - querying a database of, for example, 10,000 rows and 50 columns to 
    - return the top 10 rows, and 
    - providing this data to the LLM and asking it to answer the question using it. 
    - The AI part is done by the LLM.
    
    
![Screenshot%202024-09-29%20at%206.06.13%20PM.png](attachment:Screenshot%202024-09-29%20at%206.06.13%20PM.png)

- Take the question from the user.
- Pass the question to the LLM.
- Ask the LLM to use the database to respond to the question.
- Return the answer to the user.
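
The "return the top 10 rows" step is where ordinary code can do work before the LLM is involved. Here is a rough sketch, assuming a hypothetical table of 10,000 placeholder rows and a toy word-overlap score. The names `relevance`, `top_rows`, and `build_prompt` are made up for illustration, and no model is actually called:

```python
import heapq
import re


def relevance(question, row):
    """Toy score: number of question words that also appear in the row's text."""
    question_words = set(re.findall(r"\w+", question.lower()))
    row_words = set(re.findall(r"\w+", row["text"].lower()))
    return len(question_words & row_words)


def top_rows(question, rows, k=10):
    """Keep only the k most relevant rows instead of passing the whole table along."""
    return heapq.nlargest(k, rows, key=lambda row: relevance(question, row))


def build_prompt(question, rows):
    """Assemble the text that would be sent to the LLM (no model is called here)."""
    context = "\n".join(row["text"] for row in rows)
    return f"Answer using only this data:\n{context}\n\nQuestion: {question}"


# Hypothetical table: 10,000 rows of placeholder text, one of them relevant.
rows = [{"id": i, "text": f"record {i} about topic {i % 100}"} for i in range(10_000)]
rows[1234]["text"] = "refund policy for cancelled orders"

question = "What is the refund policy?"
selected = top_rows(question, rows, k=10)
prompt = build_prompt(question, selected)
print(len(selected), selected[0]["id"])
```

`heapq.nlargest` keeps only the best `k` candidates while scanning, so the prompt carries 10 rows rather than 10,000.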


Technically, the human isn't doing anything. The AI part is handled by the LLM.

Without optimization, 
- the AI (the LLM) would handle everything,
    - consuming a lot of time and 
    - a lot of space. 
    
However, as an AI engineer, your job is to reduce 
- the LLM's share of the work and 
- increase the share handled by your own code, to save on space and cost.

(For example, if the LLM was doing 100% of the work and the human 0%, aim to move at least 80% of the work into human-written code and reduce the dependency on AI models, and so the space they consume. When LLM calls and API calls are reduced, the cost goes down.)

The goal is to find 
- the most effective way to reduce 
    - time and
    - storage, 
    
which is where DSA comes into play.

DSA offers multiple algorithms and techniques to solve problems, including 
1. linear search (the Netflix example), 
2. the two-pointer technique, 
3. binary search, 
4. greedy algorithms, 
5. dynamic programming, 
6. hashing (hash tables), and 
7. tree and graph traversals like 
    - depth-first search (DFS) and 
    - breadth-first search (BFS). 
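
To give a feel for the last item, here is a minimal sketch of DFS and BFS on a small, made-up tree stored as a dictionary of children. Because it is a tree, no "visited" set is needed; a general graph with cycles would require one:

```python
from collections import deque

# Hypothetical tree stored as a mapping: node -> list of children.
tree = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F"],
    "D": [], "E": [], "F": [],
}


def dfs(graph, start):
    """Depth-first: follow one branch to the bottom before backtracking."""
    order, stack = [], [start]
    while stack:
        node = stack.pop()
        order.append(node)
        # Push children in reverse so the leftmost child is visited first.
        stack.extend(reversed(graph[node]))
    return order


def bfs(graph, start):
    """Breadth-first: visit nodes level by level using a queue."""
    order, queue = [], deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(graph[node])
    return order


print(dfs(tree, "A"))  # ['A', 'B', 'D', 'E', 'C', 'F']
print(bfs(tree, "A"))  # ['A', 'B', 'C', 'D', 'E', 'F']
```

The only structural difference is the container: a stack (last in, first out) gives DFS, and a queue (first in, first out) gives BFS.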

The data structures we'll work with include 
1. arrays, 
2. lists, 
3. dictionaries, 
4. tuples, 
5. sets, 
6. linked lists, and 
7. hash maps. 
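
As a quick orientation, here is how several of these look in Python. The values are placeholders chosen for illustration, and the `Node` class is a bare-bones sketch of a singly linked list rather than a full implementation:

```python
movies = ["Race", "Dhoom", "Race"]                # list: ordered, mutable, allows duplicates
pair = ("Race", "Action")                         # tuple: ordered, immutable
unique_titles = set(movies)                       # set: no duplicates, fast membership tests
genre_of = {"Race": "Action", "Dhoom": "Action"}  # dict (hash map): key -> value


class Node:
    """One element of a singly linked list."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next_node = next_node


def linked_list_to_list(head):
    """Walk the chain of nodes from head to tail and collect the values."""
    values = []
    while head is not None:
        values.append(head.value)
        head = head.next_node
    return values


head = Node("Race", Node("Dhoom"))
print(len(movies), len(unique_titles), genre_of["Race"], linked_list_to_list(head))
```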

These help reduce time and space complexity.

To illustrate the importance of DSA, consider this example:

## Finding the first non-repeating character in a string like "aabbcdde". 

In this case, the answer would be "c", as it's the first non-repeating character, not "e", which is the next one.

There are different methods to solve this, including 
- brute force, 
- hash maps, and 
- two-pointer sliding window techniques. 

Each method gives the same answer, but DSA helps identify which method is the most time-efficient.

- DSA is about finding the most optimal way.

#### Method 1 : Brute Force

In [3]:
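from typing import Optional

def first_non_repeating_brute_force(s: str) -> Optional[str]:
    # O(n^2) time: compare each character against every other character.
    n = len(s)
    for i in range(n):
        found_duplicate = False
        for j in range(n):
            if i != j and s[i] == s[j]:
                found_duplicate = True
                break
        if not found_duplicate:
            return s[i]
    return None  # every character repeats (or the string is empty)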


## usage
print(first_non_repeating_brute_force("aabbcdde"))

c


#### Method 2 : Hash Map (Counter)

Here the core logic is about 4 lines of code, compared to the nested loops above.

In [5]:
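from collections import Counter
from typing import Optional

def first_non_repeating_hashmap(s: str) -> Optional[str]:
    freq = Counter(s)  # one pass: count how often each character occurs
    for char in s:     # second pass: the first character with a count of 1 wins
        if freq[char] == 1:
            return char
    return None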


## usage
print(first_non_repeating_hashmap("aabbcdde"))

c


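#### Method 3 : Using an Ordered Dictionary (Preserve Order)

Since Python 3.7, a plain `dict` also preserves insertion order, so `OrderedDict` is used here mainly to make the intent explicit.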

In [8]:
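from collections import OrderedDict
from typing import Optional

def first_non_repeating_ordered_dicts(s: str) -> Optional[str]:
    freq = OrderedDict()  # keys stay in the order characters first appear
    for char in s:
        freq[char] = freq.get(char, 0) + 1
    # Scan the distinct characters (not the whole string) in first-seen order.
    for char, count in freq.items():
        if count == 1:
            return char
    return None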


## usage
print(first_non_repeating_ordered_dicts("aabbcdde"))

c


#### Method 4: Two-Pointer Sliding Window

In [11]:
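from typing import Optional

def first_non_repeating_two_pointer(s: str) -> Optional[str]:
    freq = {}
    left = 0  # candidate index of the first non-repeating character
    
    for right in range(len(s)):
        freq[s[right]] = freq.get(s[right], 0) + 1
        # Move left past every character already known to repeat.
        while left <= right and freq[s[left]] > 1:
            left += 1
    return s[left] if left < len(s) else None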


## usage
print(first_non_repeating_two_pointer("aabbcdde"))

c


In interviews, candidates often use brute force because it's intuitive. 

However, interviewers expect you to 
- reduce time complexity and 
- solve the problem more efficiently. 

This requires a solid understanding of various DSA methods and practice, which is essential for mastering DSA.

For example, the brute-force method makes too many comparisons. Can they be reduced? To answer that, you need to 
- know these methods, 
- build a strong practice base, and 
- keep practicing extensively.
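
To make "more efficient" concrete, we can count the work each approach does instead of timing it. This is a rough sketch: the step-counting helpers are written for illustration, and the input is a deliberately bad case (500 distinct characters, each appearing twice, so nothing is non-repeating):

```python
from collections import Counter


def brute_force_steps(s):
    """Count character comparisons made by the nested-loop approach."""
    steps = 0
    for i in range(len(s)):
        duplicate = False
        for j in range(len(s)):
            if i == j:
                continue
            steps += 1
            if s[i] == s[j]:
                duplicate = True
                break
        if not duplicate:
            break  # found the first non-repeating character
    return steps


def hashmap_steps(s):
    """Count character visits made by the count-then-scan approach."""
    steps = len(s)  # one pass to build the frequency table
    freq = Counter(s)
    for char in s:
        steps += 1
        if freq[char] == 1:
            break
    return steps


block = "".join(chr(ord("a") + i) for i in range(500))  # 500 distinct characters
worst_case = block + block                              # every character appears twice

print(brute_force_steps(worst_case), hashmap_steps(worst_case))  # 500000 2000
```

For this 1,000-character input, the nested loops make 500,000 comparisons while the hash map approach visits 2,000 characters, and the gap widens as the string grows.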

###### What are the other major reasons why DSA is required? 
- We already know that it reduces time complexity and 
- space complexity. 
- It’s the most efficient way of coding, and that’s why it’s often asked in interviews.


#### But let's look at it from an interviewer’s perspective. 

Imagine you're an interviewer tasked with hiring someone. In 2018 or 2019, a person with a basic resume and two or three decent projects who knew basic coding would likely get selected. But now, everyone’s resume looks top-notch, and it’s hard to tell who’s truly skilled.

Resumes used to hold more weight, but now they are sometimes overrated. People can add anything to their resumes, like projects they got from ChatGPT or similar tools, even if they haven't truly worked on them. As an interviewer, I can't always tell if a candidate is genuine. That's why companies need ways to test your credibility, like how fast you can think or code. For example, consulting firms like McKinsey or Bain often conduct case study rounds before anything else.

Nowadays, resume weight is only about 2 out of 10, while analytical skills and coding ability are much more important. Analytical skills are often tested through case studies, and coding ability is tested through platforms like HackerRank or LeetCode. No matter what an influencer says, in a real interview, if you can’t code or solve problems on the spot, you won’t be selected.

For example, a candidate can simply give ChatGPT a prompt like: "Give me top-notch projects that I can add to my resume, written in detail. Don't give me basic projects; give me something that makes me stand out and is in line with big companies' standards."

AI advancements have increased expectations for coding, as companies can’t always tell who is faking their qualifications. This is why data science and engineering roles now often require DSA knowledge, which wasn’t the case in earlier years. Almost every company now has DSA coding tests as the first round.

In summary, there are three main reasons why DSA is crucial: 
1. It’s frequently asked in interviews. 
2. It helps you handle the kind of work expected in companies.
3. AI advancements and the growing competition mean that coding ability is more important than ever.

Hope this explanation clarifies why DSA is so important. If you have any more questions, feel free to ask. This course is going to help you, and I’ll explain the format and structure of it. 



Case studies will always be something you haven’t done before. For example, they might say, “Let’s assume you have opened a new food restaurant, and you are seeing a decline in sales over the last six months.” Imagine you have a restaurant, and its sales were stable from 2020 to 2023, but over the past six months, the revenue has declined. They will ask you to figure out the reason for the decline and how you would investigate it.

Now, if you can't answer this question, to be honest, you probably won’t get selected, no matter what your resume says. This is crucial because what’s being tested here is your general analytical ability—can you think critically? This is very important.

In terms of importance, I would rate analytical skills 4 out of 10. Once you've cleared this part, then comes the coding. If your role involves coding, the remaining 4 goes to coding skills, and your resume, projects, and past experience account for the remaining 2. In reality, your coding ability in live scenarios weighs much more than the information on your resume. You could have done amazing projects in the past, but if you can't code when needed, you won't be selected.



Many influencers online talk about various aspects of recruitment, but some don't really know what the current process looks like. The expectation from candidates has risen, especially with the advancement of AI, because companies can't easily tell who is faking their skills. That's why coding tests like those from HackerRank or LeetCode have become common for roles, including data science.

#### Reason 3 - WHY?

Additionally, companies often give you existing code rather than expecting you to build everything from scratch, unless you're working on proof-of-concept projects. So, it becomes essential to **quickly understand and optimize existing models or code.** Even using tools like ChatGPT for advanced coding problems only works about 10-15% of the time. For the foreseeable future, coding skills remain vital if you want to succeed in data analytics, engineering, or science roles.

If you’re serious about cracking interviews, especially in these fields, focus on mastering coding and analytical thinking. These two skills are non-negotiable.