<img src="C:/Users/shrav/Downloads/neuracamp_logo.png" alt="NeuraCamp Logo" width="150" height="100"/>

# Python for Data Science - Beginner
Author: [Shravan Khunti](https://www.linkedin.com/in/shravan-khunti)

# Lesson 1: Welcome & Setup

<em> Some courses start with corporate poetry about “AI transformation pipelines” and “leveraging data ecosystems.” Not here. We’re going straight to what matters. </em>

Python for Data Science is **awesome**. Not "awesome" in a bland, corporate way – I mean actually game-changing, fun, and yes, extremely valuable.

In this first lesson, we'll set the stage for your journey from newbie to data ninja. Buckle up, because we're about to dive into: why Python is the go-to language for data science, how companies use data science (and why they *love* hiring people who can do it), what this course will cover, and how to get your environment set up.

By the end, you'll even run your very first Python code.

Sound good? Great, let's get started!

# Why Python for Data Science Rocks

Python is the **MVP of data science** – the Most Versatile Player. Sure, there are other tools and languages out there (shoutout to R, Excel, SQL, etc.), but Python hits the sweet spot. It's easy to learn *and* incredibly powerful. You can use Python to wrangle a million-row dataset before your morning coffee, automate boring tasks, build machine learning models, or even deploy a web service. In other words, Python lets you do *everything* from quick data tweaks to building the next big AI app.

Companies know this, and they have practically adopted Python as the unofficial language of data science. Why? Because it gets the job done. **Fast.** Need to analyze user behavior from a mobile app? Python can do that. Want to prototype a machine learning model to predict sales? Python’s got your back. From **startups to Fortune 500** giants, everyone uses Python. Google, Netflix, Spotify, Instagram, **NASA** – you name it – they all use Python in some part of their tech stack. In fact, walk into any data science team’s daily stand-up meeting and you'll hear Python all over the place:

> "I wrote a script to process those logs..."  
> "Let's use this Python library to visualize the results."

It's everywhere in industry, powering **data-driven decisions** and crazy cool innovations.

---

But let's be real: one big reason Python rocks is **what it can do for *you*.** Learning Python (and data science) can open doors to exciting projects *and* high-paying jobs.  

> Data Science has been called *"the sexiest job of the 21st century,"* and while we chuckle at the phrasing, it’s true that skilled data folks are in **hot demand**.

Companies are desperate for people who can turn their mountains of data into actionable insights. Whether it's helping a retailer figure out what product to stock up on, or enabling a healthcare company to predict disease outbreaks, data skills are driving real impact – and employers pay top dollar for them.

So if you master Python for data science, you're not just picking up a new hobby – you're investing in a valuable career skill.  
***(Cha-ching!)***

### Python Trivia That Actually Shows Up in Interviews

Some recruiters sneak in Python history or trivia to see if you've done your homework. Here's the good stuff you should know:

**Who created Python?**  
[Guido van Rossum](https://en.wikipedia.org/wiki/Guido_van_Rossum), in the late 1980s while working at [CWI](https://en.wikipedia.org/wiki/Centrum_Wiskunde_%26_Informatica) in the Netherlands.

**Why is it called Python?**  
It’s named after *Monty Python’s Flying Circus* (not the snake) — Guido wanted a name that was fun and unique.

**When was it released?**  
First official version (Python 0.9.0) dropped in 1991.

**What kind of language is Python?**  
- **Interpreted**:  
  You don’t need to “compile” your Python code like in some other languages. You just write it and run it directly. Think of it like writing a note and immediately reading it out loud — no extra steps.

- **High-level**:  
  Python is close to human language. You don’t have to worry about computer memory, hardware stuff, or writing 10 lines just to do something simple. You can focus on logic and problem-solving.

- **Dynamically typed**:  
  You don’t have to say what type of data (like number or text) you're using — Python figures it out. For example, just write `x = 5` and Python knows it's a number.

- **Supports object-oriented, procedural, and functional programming**:  
  Python gives you multiple ways to write your code:
  - **Object-oriented**: Organize code into reusable “objects” (like mini-programs).
  - **Procedural**: Write a step-by-step list of instructions (like a recipe).
  - **Functional**: Treat your code like math functions — clean, modular, and reusable. 
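
To make those three styles a bit more concrete, here is a minimal sketch that adds up the same list of prices three different ways. The names `prices` and `Cart` are made up for illustration, and you don't need to understand every line yet.

```python
from functools import reduce

prices = [10, 20, 30]  # hypothetical sample data

# Procedural: a step-by-step recipe that updates a running total
procedural_total = 0
for price in prices:
    procedural_total += price

# Object-oriented: bundle the data and the behavior into one object
class Cart:
    def __init__(self, prices):
        self.prices = prices

    def total(self):
        return sum(self.prices)

oo_total = Cart(prices).total()

# Functional: combine the values with a function, starting from 0
functional_total = reduce(lambda a, b: a + b, prices, 0)

print(procedural_total, oo_total, functional_total)  # 60 60 60
```

All three give the same answer. Which style you reach for mostly depends on the size of the problem and on taste.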

**Why do companies love it?**  
- Simple, readable syntax  
- Huge libraries for data and ML  
- Jupyter Notebooks = easy experimentation  
- Works well with SQL, APIs, cloud tools  
- Used at Google, Netflix, Meta, Amazon — everywhere

---

Don’t worry if this felt a bit jargon-y. By the end of this course, you’ll understand (and casually drop) all these terms like a pro.

# What *Is* Data Science (and Why Do Companies Care)?

**Data Science** is a fancy term for **extracting insights from data**. That’s it, plain and simple. It’s what happens when you take raw information (like sales numbers, web clicks, sensor readings, you name it) and **analyze** it to find patterns, answer questions, or make predictions.

Think of data scientists as detectives, but instead of solving crimes, they're solving business problems using data clues. They clean and examine datasets to discover trends ("Hey, our users in Europe are spending more on weekends"), build models to predict future events ("Which customers are likely to cancel their subscription next month?"), present those insights in a compelling way ("Here’s a neat graph showing our growth projections"), and sometimes even uncover hidden patterns that make you go “wait, what?” like how Walmart discovered that strawberry Pop-Tarts sales spike sevenfold before hurricanes because, obviously, that’s the top survival food! ([check out a short case study on it here](https://www.snowdatascience.org/post/how-data-science-helped-walmart-predict-sales-during-a-hurricane))

---

Why do companies care so much about this?  
**Money and competitive edge.** In today’s world, data is *gold*. Companies that harness their data effectively can outsmart and outrun those that don’t.

For example:
- An e-commerce company might use data science to personalize recommendations – ever wonder how Amazon seems to know what you want to buy next?
- A logistics company might use it to optimize delivery routes and save on fuel costs – which is probably why your Amazon Prime order shows up faster than your regrets.
- Social media platforms use data science to keep you engaged (or addicted 😅) by showing content you're likely to interact with.

Essentially, data science helps organizations make **informed decisions** rather than gut feelings. Whether it’s reducing operational costs, improving customer experience, or identifying new market opportunities, data-driven decisions give companies a **big advantage**.

They care about data science because it translates to tangible business value – and no one likes leaving money on the table.

---

From a career perspective, this is great news for *you*. If you can speak the language of data – and that’s exactly what Python will enable you to do – you're an instant MVP in those companies.

You'll be the person who can **back up ideas with facts and figures**, who can build a quick prototype to test a hunch, or dig into the database to answer that weird question your manager just thought up. When everyone’s tossing around opinions in a meeting, you’ll be the one bringing the *data* — and in today's world, **data >>> opinion**, every single time. It’s a powerful position to be in.

In short, **data science is impactful**:  
- It’s how Netflix decides what show to produce next  
- How hospitals identify at-risk patients  
- How banks detect suspicious activity and lock your card while you're still on the checkout page 😭 

By learning it, you’re basically learning how to read the story hidden in piles of numbers – a story every company wants to hear.


# What You’ll Learn in This Course (Overview)

Before we jump into setup details, here's a quick roadmap of where we're headed. The course is structured in **three levels** – Beginner, Intermediate, and Advanced – to take you from zero to hero in data science with Python. After completing all three, you should be well prepared to interview at FAANG, MAANG, MANGA, or whatever acronym you're following these days.

Each module has a different vibe and goal, and yes, you will be doing a lot of coding. This is not a blog where you just sit back and read: you only become a solid data person or programmer by building the habit of actually writing code. That is why every topic, lesson, and module in NeuraCamp comes packed with hands-on exercises inside Jupyter notebooks and coding terminals hosted by us. And yes, it is completely free to use!


---

### • Beginner Level:
We assume you **know nothing** about coding or Python. Everyone starts somewhere!

In this part of the course, we'll cover Python basics from the ground up. You’ll learn how to use Python like a calculator, manipulate text, and work with fundamental data structures (lists, dictionaries, etc.).

We’ll introduce you to **Jupyter Notebooks**, an awesome tool for writing and running code interactively (more on that soon). By the end of the beginner module, you'll be comfortable writing small programs, using libraries like **NumPy** and **Pandas** to play with data, and even loading real datasets.

In short, we'll get you from "Hello, World!" to actually exploring data in Python.

---

### • Intermediate Level:
Now things get **interesting**.

This part is all about becoming a **productive data analyst/scientist**. You’ll dive deeper into **data wrangling**: cleaning messy data, filtering and grouping data to answer specific questions, and merging datasets together.

We’ll emphasize **exploratory data analysis** – basically, slicing and dicing data to uncover insights. Expect to create your first **visualizations** with Python libraries like **Matplotlib** and **Seaborn** (because a picture is worth a thousand rows of data).

We'll also introduce the fundamentals of **machine learning** – yep, by the end of this module you'll build and evaluate your first predictive models (think linear regression and simple classifiers).

And because real-world data often lives in databases, you'll even learn how to use **SQL inside Python** to pull in data.

This level ties together programming and analysis skills, preparing you to tackle real business questions.

---

### • Advanced Level:
Time to level up to **data science ninja**.

The advanced module pushes you into the realm of cutting-edge techniques and larger-scale data. We'll cover more complex machine learning models like:
- **Decision Trees**
- **Random Forests**
- **Boosting Algorithms**

These often win Kaggle competitions and impress employers.

You’ll practice techniques for **model tuning and evaluation** – making sure your models aren’t just accurate, but also robust and reliable. We’ll also explore **unsupervised learning** (like clustering customers into segments when you don't have labeled data).

Let’s not forget the buzzwords: you’ll get a taste of **deep learning & AI** with neural networks, and learn how to handle **Big Data** using tools like **PySpark** (because not all data fits in an Excel sheet or even in Pandas!).

Finally, we’ll teach you how to gather data from the modern web – using **APIs** and **web scraping** – so you can fuel your projects with interesting datasets.

By the end, you’ll run through an **end-to-end project** pulling together everything: data acquisition, cleaning, analysis, modeling, and even presenting results.

In other words, you'll be ready to handle real-world data science tasks from scratch. 🎓

---

Also, we’ve got **plenty of exercises and challenges** inspired by actual interview questions and real-world scenarios. As you progress, you won’t just passively read or watch — you’ll actively **do**. I highly recommend jumping into the problems once you're ready — they’ve been carefully curated based on real interviews, so they mirror the kind of questions top companies actually ask.

If you’ve ever used platforms like LeetCode, HackerRank, or CodeChef, you’ll feel right at home — but here, it’s all integrated into your learning flow. Practicing regularly is one of the most effective ways to build muscle memory for coding interviews, especially at competitive high-impact companies.

By the time you're through, you’ll have built a portfolio of code and mini-projects that you can even show off to potential employers. The journey from beginner to advanced isn't just intense, it's also really fun – and we (the NeuraCamp team) will be cheering you on every step of the way!

# Setting Up Your Python Environment 🛠️

Alright, enough talk – let’s get your tools set up so you can start coding. For this course, you’ll primarily use **Python** and **Jupyter Notebook**, plus a few other essentials. Don’t worry if you’ve never installed software or used a terminal before; we'll walk you through it.

Setting up your environment is kind of like prepping your kitchen before cooking: a bit of prep makes everything smoother and more enjoyable.

---

### • Python Installation:
If you don’t have Python on your computer yet, the easiest way to get it (especially for data science) is to install **Anaconda Distribution**.

Anaconda is a one-stop shop – it bundles:
- Python itself
- A ton of useful libraries and tools (including Jupyter Notebook)

📦 Just go to the [Anaconda website](https://www.anaconda.com/products/distribution), download the latest Python 3.x version, and run the installer. It’ll handle the heavy lifting.

> *(If you already have Python installed or prefer a leaner setup, you can install Jupyter separately by running `pip install notebook` or `pip install jupyterlab` in your terminal. But if that sentence sounded like gibberish, stick with Anaconda for now!)*

---

### • Jupyter Notebook:
Once Python (and Jupyter) are installed, it’s time to launch your new playground.

**Jupyter Notebook** is an interactive coding environment that runs in your web browser. Don’t let the “web” part confuse you – it’s all running on *your own computer*, just using the browser as an interface.

To start Jupyter:
1. Open your **Terminal** (or Command Prompt on Windows).
2. Type:
   ```bash
   jupyter notebook
   ```
3. A browser window should open, showing the Jupyter interface.

> It usually starts at a directory (folder) on your computer, and you can navigate or create notebooks from there.

---

### • Creating Your First Notebook:
In the Jupyter interface:
- Click the **New** button
- Select **Python 3** (or similar)

This opens a fresh notebook (usually named `Untitled.ipynb`).

You’ll see a place to write code (a cell) and some menus at the top.

🎉 **Congrats – you now have a working Jupyter Notebook ready to execute Python code!**

> If you get stuck on any step (installation, launch, etc.), don’t panic. We’ve all been through tricky installs — it’s worth it once you're up and running.

---

# A Note on Terminal, Scripts, and Git

As a budding programmer/data-scientist, you’ll also encounter a few more tools and concepts as you go.

---

### • Terminal:
The terminal (aka command line) is a text-only interface to control your computer.

It may feel old-school, but it’s:
- Powerful
- Fast
- A **must-know** skill for developers and data scientists

You’ll use it to:
- Launch notebooks: `jupyter notebook`
- Install packages: `pip install <package>`
- Navigate folders: `cd foldername`

**Windows:** Use Anaconda Prompt or PowerShell  
**Mac/Linux:** Use the built-in Terminal app

Think of it as a secret language you’ll get fluent in over time.

---

### • Scripts vs. Notebooks:
We’ll use **Jupyter Notebooks** in this course — great for learning and documenting your code.

But you may also hear about **Python scripts** — plain `.py` files.

You run a script like this:
```bash
python myscript.py
```

Scripts are useful when:
- Automating tasks
- Finalizing deployable programs
- Reusing code across projects

For now, notebooks are home base 🏡 — we’ll gradually move to scripts as you grow.

---

### • Git:
Ever wish you could undo file changes or collaborate without messy versions?

**Git** is your savior. It’s a:
- Version control system
- Code history tracker
- Collaboration tool

Even if you’re solo, Git is a great habit.

To check if it’s installed:
```bash
git --version
```

If not, download from [git-scm.com](https://git-scm.com/) and follow their setup guide.

You’ll learn the basics as we go — enough to manage code like a pro. 💪

---

# Your First Python Code

Enough setup talk – let’s write some code! It's time to make sure your environment is working and get a taste of Python.

A time-honored tradition in programming is to start with a **"Hello, World!"** example, which simply prints out that friendly greeting. It may sound trivial, but getting this to work means you've successfully set up Python and Jupyter. Plus, it's oddly satisfying to see the computer do something *you* told it to do.

So let’s do it.

---

### Step-by-Step:
1. **Open your new Jupyter Notebook**  
   If you closed it, launch it again with:

   ```bash
   jupyter notebook
   ```

2. In the first cell of the notebook, type this code:

   ```python
   print("Hello, World!")
   ```

3. **Run the cell** by:
   - Clicking the “Run” button in the toolbar (▶️)
   - Or pressing `Shift + Enter`

If everything is set up correctly, your notebook should output:

```
Hello, World!
```

---

### What just happened?
- `print()` is a Python function that tells the computer to display whatever is inside the parentheses.
- `"Hello, World!"` is a string of text, so Python printed it below your code cell.

> 🎉 This confirms your Python + Jupyter environment is working!

---

### Why it matters:
This little win marks the beginning of your data science journey.

It may be just one line of code today, but soon you’ll:
- Clean and analyze large datasets
- Build visualizations
- Train models
- Write powerful, reusable scripts

And it all begins with:
```python
print("Hello, World!")
```

---

Take a moment to appreciate this:  
**You just wrote a program in Python!**  
It might be the simplest program possible, but it’s a *real* one. So give yourself a small celebration — you earned it.

Now let’s build on it, step by step. 🚀

# Lesson Summary & Cheat Sheet

That was a lot of information, so let's **summarize** the key points from this welcome lesson. Think of this as your quick-reference cheat sheet for Lesson 1:

---

### • Python & Data Science are a Perfect Match:
Python is a beginner-friendly yet powerful programming language widely used in data science. Companies love it because it can do everything from data cleaning to building AI models. In short, Python is the *go-to tool* to get data science work done efficiently. If you learn it well, you become instantly more valuable in the job market.

---

### • Data Science = Turning Data into Insights:
It's all about extracting meaning from data to solve problems or make decisions. Companies care because these insights can save money, boost profits, or create new opportunities. As a data scientist, you’ll play detective with data – a skill that’s in high demand.

---

### • Course Structure:
This course will guide you through **Beginner, Intermediate, and Advanced** levels.

- **Beginner module**: You’ll learn Python basics (variables, data types, lists, dictionaries, etc.), and by the end of it you’ll be handling real datasets with Pandas.
- **Intermediate module**: Builds up your data analysis and visualization skills, and introduces machine learning fundamentals.
- **Advanced module**: Covers deeper machine learning, big data tools, and a bit of AI.

Each step has hands-on exercises and real-world inspired problems to solve.

---

### • Setup Recap:
You installed Python (hopefully via Anaconda for convenience) and **launched a Jupyter Notebook**.

To start Jupyter, use the terminal and run:
```bash
jupyter notebook
```

In Jupyter, you create a new notebook and write code in cells.  
We also introduced the **Terminal** and **Git** as powerful tools you'll grow comfortable with.

---

### • Your First Code:
You wrote and executed:
```python
print("Hello, World!")
```
in a notebook. This verified that your environment is working.

The `print()` function is a basic way to output information and is great for testing and displaying quick results. Executing a cell with `Shift + Enter` runs the code and shows output instantly.

---

### • Cheat Sheet - Basic Commands:

- **To run a cell in Jupyter**: click the cell and press `Shift + Enter` (or use the Run ▶️ button)
- **To print output in Python**: use `print(your_message)`
- **To open the terminal**:
    - **Windows**: Anaconda Prompt / PowerShell
    - **Mac**: Terminal app
    - **Linux**: Terminal

- **To launch Jupyter**: type `jupyter notebook`
    - Press `Ctrl+C` in the terminal to stop the server

- **To check Python install**:
    ```bash
    python --version
    ```

- **To check pip install**:
    ```bash
    pip --version
    ```

- **To verify Git install**:
    ```bash
    git --version
    ```

---

### • Troubleshooting Tips:
If "Hello, World!" didn’t work, or you hit an error launching Jupyter, double-check that Python/Anaconda is installed correctly. Make sure you’re running the command in the right place (e.g., the Anaconda Prompt on Windows). Error messages might look scary, but they often hint at the problem. And remember, Google (or our community forum) is your friend – whatever issue you run into, chances are someone else has had the same problem and posted a solution online. Don’t hesitate to search for help.

---

That wraps up the first lesson!  
You’ve learned why Python and data science are worth your time, how companies use them in the real world, and what lies ahead in this course. You also set up your environment and ran your first Python code. Take a moment to celebrate that progress.

In the next lesson, we’ll start digging into Python fundamentals (you'll discover how to use Python as a calculator, among other things).  
**Tip:** Between lessons, try tinkering a bit more in your Jupyter Notebook – maybe change the message in the `print()` or create another cell and do simple math like `5 + 3`. Getting comfortable with the interface now will make the upcoming lessons even easier.

---

**Welcome once again to the world of Data Science with Python – let’s have some fun and learn a ton!**

## Python: Your Overqualified Calculator

Before we conquer data science case studies, let's see Python do what it was originally **born** to do: math! Think of it as a super-calculator that can also eventually send rockets to Mars (thanks, SpaceX engineers), but one step at a time…

In a Jupyter Notebook code cell (or an interactive Python shell), you can perform arithmetic just like on a calculator 💡. No, seriously – type an equation and Python gives you the result. For example:

```python
# Let's do some basic math
2 + 2
```

```
4
```

Run that, and you'll get output `4`. Thrilling, right? Python supports the usual suspects:
- `+` (addition)
- `-` (subtraction)
- `*` (multiplication)
- `/` (division)

It also respects order of operations (PEMDAS, anyone?). So `2 + 3 * 4` will give `14`, not `20`, because multiplication happens before addition. Use parentheses `( ... )` if you want to force a different order. For instance, `(2 + 3) * 4` gives `20` – straightforward enough 🧠.
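
Here is a small sketch of those two calculations side by side. The names on the left are just labels for the results (variables get a proper introduction a little later).

```python
without_parentheses = 2 + 3 * 4   # multiplication first: 2 + 12
with_parentheses = (2 + 3) * 4    # parentheses first: 5 * 4

print(without_parentheses)  # 14
print(with_parentheses)     # 20
```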

One neat quirk: the division operator `/` **always** gives a float (decimal number) in Python, even if the result is a whole number. Try this:

```python
8 / 4
```

```
2.0
```

Instead of `2`, you'll get `2.0`. Python is telling you "hey, this result might have been fractional, so here's a float for consistency." Good to know, especially when you're dividing things like Netflix watch hours or revenue, where fractional results make sense.

And yes, Python can handle more than basic ops:
- `//` for floor division (rounds down to the nearest whole number, e.g. `5 // 2` gives `2`)
- `%` for modulus (remainder, so `5 % 2` gives `1`)
- `**` for exponent (power, so `2 ** 3` gives `8`)

Feel free to play with these in a code cell – Python won't mind. It's basically a gym for your brain's math muscles.
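
If you'd like a starting point, here is a short sketch of all three operators. The variable names are just illustrative labels. The last line shows that floor division rounds *down*, which only looks surprising once negative numbers are involved.

```python
floor_result = 5 // 2      # 2   (2.5 rounded down)
remainder = 5 % 2          # 1   (what is left after 5 // 2)
power = 2 ** 3             # 8   (2 * 2 * 2)
negative_floor = -5 // 2   # -3  (-2.5 rounded down, not toward zero)

print(floor_result, remainder, power, negative_floor)  # 2 1 8 -3
```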

## Comments: Your Code's Post-It Notes

Ever jot down a note on a sticky note and slap it on a document for future-you to read? That's what **comments** are in code. They are completely ignored by Python, but incredibly useful for humans (like your future self, or the poor soul inheriting your code at Amazon).

In Python, anything after a `#` symbol is a comment:

```python
# This is a comment. Python will happily ignore it when running the code.
print("Hello there!")  # This prints a greeting to the screen
# print("This line won't run because it's commented out.")
```

```
Hello there!
```


---
When you run this cell, it will output `Hello there!` and nothing more. The lines starting with `#` are not executed – they're just notes.

Use comments to explain *why* you're doing something, or to disable code without deleting it. For example, if you have a super complex calculation that you need to temporarily turn off, just slap a `#` in front of it. Future-you will thank present-you for any helpful breadcrumb left in comments. As a budding data scientist, writing clear comments can set you apart – it shows you can communicate your thought process (a skill much appreciated in team environments).

_Pro tip_: Keep comments concise and relevant. No need to write an essay – just clarify the non-obvious.  
And please, **no "joke" comments that might get you side-eye from a code reviewer**. Save your wit for conversation, not production code (lesson materials excluded, of course).


## Variables: Data Containers for Everything

Imagine working at Netflix and having to repeatedly type the number of subscribers (say, 23,000,000) all over your analysis code. Sounds error-prone and tedious, right? Instead, we use **variables** to store such values with a descriptive name, so we (and Python) can reuse them easily.

A variable is basically a label for a piece of data. In Python, you create (or **assign**) a variable using the `=` sign. For example:

```python
# Netflix scenario: calculate monthly revenue from subscribers
subscriber_count = 23_000_000    # number of subscribers (an integer)
price_per_user   = 12.99         # monthly price in dollars (a float)
monthly_revenue  = subscriber_count * price_per_user
print(monthly_revenue)
```

```
298770000.0
```


---
When you run this, Python will dutifully multiply 23,000,000 by 12.99 and spit out the result (which is a big number, roughly 298 million). We just stored data in variables and used them in a calculation – congrats, you’re automating work like a pro. No abacus needed.

A few things to note about variables:
- **Naming**: Use meaningful names. `subscriber_count` is way better than `x` or `var1`. However, the name can't start with a number or contain spaces. Stick to letters, numbers, and underscores `_` (and no, `price-per-user` won't work – Python will think you're trying to subtract something). Also, variables are case-sensitive: `Revenue` and `revenue` would be different.
- **Dynamic Typing**: Python doesn’t require you to declare a type for a variable. It figures it out at runtime. In our example, it knows `subscriber_count` is an int and `price_per_user` is a float based on the value assigned. If later you do `subscriber_count = "twenty three million"`, Python will happily change the type to string – _but please don’t do that in sensible code!_ Keep types consistent unless you have a good reason to change them.
- **Using Variables**: Once assigned, you can use the variable name instead of the value it holds. Python will substitute it behind the scenes. If you try to use a variable that hasn’t been assigned yet, you’ll get a not-so-friendly `NameError`. So, always assign first, then use.
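
The three notes above can be sketched in a few lines. This is only an illustration: `churn_rate` is a hypothetical name that is deliberately never assigned, and the `try` / `except` wrapper (not covered yet) is there only so the cell keeps running after the error.

```python
subscriber_count = 23_000_000
type_before = type(subscriber_count)         # int

subscriber_count = "twenty three million"    # legal, but confusing: avoid in real code
type_after = type(subscriber_count)          # str

revenue = 100
Revenue = 200    # a separate variable, because names are case-sensitive

try:
    print(churn_rate)    # hypothetical name, never assigned
except NameError as error:
    name_error_message = str(error)

print(type_before, type_after)
print(revenue, Revenue)  # 100 200
print(name_error_message)
```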

Think of variables as sticky notes holding data. You can stick them on all sorts of data types (numbers, text, etc., which we'll cover next). Just make sure your sticky note labels (variable names) make sense for what's on them.

## Integers and Floats in Action

Say you're at **Apple** analyzing iPhone sales:

```python
# Apple scenario: Calculate revenue from iPhones sold
num_iphones_sold = 5_000_000    # 5 million iPhones sold (int)
price_per_iphone = 999.99       # price in USD (float)
total_revenue = num_iphones_sold * price_per_iphone
print(total_revenue)
```

```
4999950000.0
```


---
This will output a large number (in the billions). Here, `num_iphones_sold` is an integer (no decimals for count of items), and `price_per_iphone` is a float (money often isn't a whole number). Python multiplies an int by a float and gives a float result: whenever you mix ints and floats, the int is converted to float behind the scenes. The result `total_revenue` is a float.
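
If you want to check those types yourself rather than take our word for it, the built-in `type()` function reports what kind of value a variable holds. A quick sketch using the same numbers:

```python
num_iphones_sold = 5_000_000
price_per_iphone = 999.99
total_revenue = num_iphones_sold * price_per_iphone

print(type(num_iphones_sold))  # <class 'int'>
print(type(price_per_iphone))  # <class 'float'>
print(type(total_revenue))     # <class 'float'>
```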

Now, imagine **Google** budgeting for an ad campaign:

```python
google_ads_budget = 1_000_000.0   # 1 million dollars, underscores for readability
num_channels = 4
budget_per_channel = google_ads_budget / num_channels
print(budget_per_channel)
```

```
250000.0
```


---
We used a float for the total budget and an int for number of channels. The output will be `250000.0` (a float). Notice we wrote `1_000_000.0` with underscores – that's just Python syntax sugar to make big numbers easier to read (Python ignores the underscores when interpreting the number). We got a float result because, again, float divided by int yields float. No surprise there.

**Key takeaway**: Use `int` when you need whole numbers, `float` when you need decimals. And be mindful of the type – if you treat a float like an int or vice versa accidentally, you might get weird results or errors. (E.g., using floats where you expect perfect precision can bite you due to rounding issues – a topic for later, but keep it in mind.)
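
As a small preview of that rounding issue, here is the classic example. The use of `math.isclose` and `round` below is one common way to cope with it, shown as a sketch rather than the only approach.

```python
import math

total = 0.1 + 0.2
is_exactly_equal = (total == 0.3)
is_close = math.isclose(total, 0.3)

print(total)             # 0.30000000000000004
print(is_exactly_equal)  # False
print(is_close)          # True
print(round(total, 2))   # 0.3
```

The takeaway for now: compare floats with a tolerance (or round them for display) instead of expecting exact equality.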


## Strings in Action

Now let's switch gears to strings. Suppose you're a data analyst at **Meta (Facebook)** working on a welcome message for new users:

```python
user_name = "Zuckerberg"  # just kidding, we're all new users at some point
welcome_message = "Welcome to the platform, " + user_name + "!"
print(welcome_message)
```

```
Welcome to the platform, Zuckerberg!
```


---
When you run this, you'll get `Welcome to the platform, Zuckerberg!`. Here we concatenated (glued together) strings using the `+` operator. Python happily joins the text pieces into one because they were all strings.

A few string insights:
- You can use either single quotes `'...'` or double quotes `"..."` to define a string in Python. Just be consistent and make sure they match. For example, `"Hello"` and `'Hello'` are both the same string.
- If you need quotes inside your string, use the opposite quote to avoid confusion or escape them with a backslash. (E.g., `"She said 'hi'"` or `'She said "hi"'` both work.)
- Strings aren't for math. Attempting `user_name + 5` will throw a TypeError faster than you can say "oops".
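
The quoting options from the list above look like this in practice (the variable names are just illustrative):

```python
double_outside = "She said 'hi'"    # single quotes inside double quotes
single_outside = 'She said "hi"'    # double quotes inside single quotes
escaped = "She said \"hi\""         # same quote type, escaped with a backslash

print(double_outside)
print(single_outside)
print(single_outside == escaped)    # True
```

The last line prints `True` because the backslashes are not part of the text. They only tell Python that the quote belongs inside the string.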

You can only concatenate strings with strings. If you have a number that you want to stick in a string, you have to convert it to a string first (using the `str()` function), or use fancy techniques like f-strings (we'll get there). For instance, `str(5)` gives `"5"`, so `"User" + str(5)` would be `"User5"`. Just remember: **no mixing strings and numbers in plain `+` operations** – it's either math with numbers or concatenation with strings.
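
Here is a minimal sketch of both routes, plus the error you get if you skip the conversion. The name `user_id` is made up for the example, the f-string line is only a sneak peek, and the `try` / `except` wrapper is there just to keep the cell running.

```python
user_id = 5

label_with_str = "User" + str(user_id)   # convert the number first
label_with_fstring = f"User{user_id}"    # f-string: Python converts it for you

try:
    "User" + user_id                     # str + int is not allowed
except TypeError as error:
    type_error_message = str(error)

print(label_with_str)        # User5
print(label_with_fstring)    # User5
print(type_error_message)
```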

One more thing: if you put a number in quotes, it's a string, not a number. `"123" + "456"` will give you `"123456"` (concatenation) not `579`. This might seem obvious now, but it's a common beginner mistake when dealing with things like reading input. Python isn't going to assume you meant numeric addition if you gave it strings.
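
Here is a small sketch of conversion in both directions. `user_number` is a made-up value, and `int()` (the mirror image of `str()`) is shown as a preview, assuming the text really does contain a whole number.

```python
# Number -> string, so it can be concatenated with other text.
user_number = 5
label = "User" + str(user_number)
print(label)                             # User5

# String -> number, so it can be added numerically.
text_sum = "123" + "456"                 # concatenation of two strings
numeric_sum = int("123") + int("456")    # addition of two integers
print(text_sum)                          # 123456
print(numeric_sum)                       # 579
```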

## In-Notebook Practice: Try It Yourself

Throughout this lesson, we've sprinkled code examples. If you haven't already, go ahead and run them in your Jupyter Notebook. Change the values, break things (intentionally) to see what errors pop up, and make sure you understand *why*.

For instance, try these mini-experiments:

In [14]:
# Experiment 1: What happens if we treat numbers as strings?
print("2" + "3")    # Expect "23" as strings will concatenate
# print(2 + "3")     # Uncomment this line to see a TypeError

23


In [15]:
# Experiment 2: Division quirks
print(8 / 5)    # Should be 1.6 (float division)
print(8 // 5)   # Floor division, should be 1 (int result)
print(8 % 5)    # Modulus, should be 3 (remainder of 8/5)

1.6
1
3


---
Try messing around: change `8` to other numbers, try adding an integer to a float, etc. The best way to learn is by doing — and the stakes here are low. No one gets paged at 2 AM if you divide by zero in your notebook (though you'll get an error for that, obviously).
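
If you'd like a starting point for those experiments, here is one possible cell. The `try`/`except` wrapper is something we haven't covered yet; it is only there so the cell keeps running after the error.

```python
# Adding an integer to a float gives a float.
mixed = 3 + 2.5
print(mixed)            # 5.5
print(type(mixed))      # <class 'float'>

# Dividing by zero raises a ZeroDivisionError.
try:
    print(8 / 0)
except ZeroDivisionError as error:
    print("Python complained:", error)
```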

By now, you should feel more comfortable with Python acting as a basic calculator, using comments to annotate code, storing values in variables, and recognizing ints, floats, and strings. These might seem like *baby steps*, but even senior data scientists rely on these fundamentals daily. Master this, and you've earned the right to tackle bigger fish.

# Assessments

Time to test your understanding with some hands-on tasks and questions. Remember, this is all about being *job-ready*, so these are inspired by things you might actually do or be asked in a data role.

## Coding Challenges

1. **Revenue Calculator**: You're working at **Netflix** (woohoo!). The company has 1,000,000 users, each paying $15.99 per month. Using Python variables, calculate the total monthly revenue and print it out.
_Hint_: You'll need two variables (for user count and price), and one for the result.

2. **Budget Split**: **Google** allocated a budget of $2,500,000 for cloud infrastructure this quarter. If they plan to split this evenly across 5 teams, how much does each team get? Write code to compute and display the result. (Use variables for each value, of course.)

3. **String Genie**: At **Amazon**, an analyst is preparing a product launch email. They have a `product_name = "Echo Dot"` and need to create a message that says `"The new Echo Dot is available now!"`. Use string concatenation (or any other method you know) to form this message in Python and print it.
_Bonus_: *If you know about f-strings from curiosity or another course, you can use that, but concatenation with `+` or multiple prints is fine too.*

## Multiple Choice Questions

1. At **Apple**, you’re tracking the price of a new iPhone model. It costs $999.99. What data type is best for this price variable?
    A. `int`  
    B. `float`  
    C. `str`  
    D. `bool`

2. Which of the following is **NOT** a valid Python variable name?
    A. `employee_count`  
    B. `SalesQ1`  
    C. `3rd_quarter_revenue`  
    D. `totalRevenue`

_(These questions explore your understanding of data types and basic syntax in realistic scenarios. Think carefully about how Python treats each case before you answer.)_

# Summary and Cheat Sheet

Congrats on making it through Lesson 2 with your sanity (hopefully) intact. We've covered a lot of ground, and you’re now equipped to use Python as a calculator, add helpful comments, wrangle variables, and distinguish core data types. Boring? Maybe a little. But this is the bedrock of all Python programming – *every* script, data analysis, or machine learning model builds on these concepts. The upside: you're inching closer to that dream job at Facebook, Apple, or whichever big-name tech giant gives you goosebumps.

Here's a quick recap (**cheat sheet**) of the essentials from this lesson:

- **Basic Arithmetic in Python**: `+`, `-`, `*`, `/` for add, subtract, multiply, divide. Use `()` to group operations. `//` for floor division, `%` for remainder, `**` for powers. Remember, `/` always gives a float.
- **Comments**: Use `#` to start a comment. Python ignores everything after `#` on that line. Use comments to clarify code or prevent execution of code snippets during testing.
- **Variables**: Assign with `name = value`. Once assigned, you can use `name` in place of the value. Choose descriptive names and follow naming rules (start with letter/underscore, no spaces or special chars except `_`). Python is dynamically typed, so it infers the type from the value.
- **Integers** (`int`): Whole numbers (e.g., -10, 0, 42). Use for counts, indices, etc.
- **Floats** (`float`): Decimal numbers (e.g., 3.14, 2.0). Use for measurements, currency, or any fractional values. Mixing int and float in operations results in float.
- **Strings** (`str`): Text enclosed in quotes, single or double. Use for names, labels, or any non-numeric data. You can concatenate strings with `+`. But you can't add strings and numbers without conversion (that leads to a TypeError).
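
The arithmetic operators from the cheat sheet can be checked in a single cell. The numbers are arbitrary; swap in your own.

```python
true_division = 7 / 2      # 3.5 (always a float)
even_division = 6 / 2      # 3.0 (still a float, even when it divides evenly)
floor_division = 7 // 2    # 3
remainder = 7 % 2          # 1
power = 7 ** 2             # 49
grouped = (7 + 2) * 2      # 18 (parentheses are evaluated first)

print(true_division, even_division, floor_division)
print(remainder, power, grouped)
```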

Keep this cheat sheet handy. As trivial as these basics might seem, they will become second nature with practice – which is exactly what you should do now. Play around with code, tackle the assessments, and don't worry if you make mistakes. Even senior devs Google "Python string concatenate int" sometimes (we won’t tell 😉).

In the next lesson, we’ll build on this foundation and start making Python do *actual* data science-y things. Until then, happy coding, and remember: **every expert was once a beginner.** That beginner just learned the basics really well.

---

# Lesson 3: Working with Text

So far, we've seen Python handle numbers like a math whiz, but data isn't just about digits. A lot of the world runs on text – names, product descriptions, user comments, you name it. In this lesson, we'll teach Python to become a wordsmith. Get ready to **work with text** using Python's strings (and no, we don't mean guitar strings 🎸 we mean text data). By the end of this lesson, you'll know how to manipulate strings, format them cleanly (hello, f-strings!), and use some handy text tricks that will make you feel like the Shakespeare of code.

Why should you care? Because text data is everywhere. From cleaning up a messy email list to parsing social media hashtags, being good with strings is *essential* for a data pro. Plus, once you master this, you'll be able to create neat output like personalized messages and formatted reports, which is way cooler than just printing numbers all day.

Sound good? Great – let's dive into the world of Python strings!

## Strings: More Than Just Text

In Python (and programming in general), a **string** is simply a sequence of characters. You can think of it as text, whether it's a single letter, a word, or a whole paragraph. In Lesson 2, we briefly played with strings when we did things like `"Hello" + "World"`. Now we'll go deeper.

Why are strings important for data science? Imagine the data you might work with:

- User feedback comments
- Names of cities or products
- Dates and addresses
- Even DNA sequences (which are basically really long strings!)

Not everything is a neat number. Often, you need to wrangle text: clean it, combine it, or pull information out of it. Python's got your back here. It's *really* good at string manipulation (some say Python is almost as good with words as it is with numbers).

One quick cool fact: Python strings can handle **Unicode**, which means you can work with text in any language (and even emojis) without breaking a sweat. Want to print a smiley face? Python can do `print("😊")` just fine. In a global data science project, that matters. But let's start with the basics before we send emoji rockets to space 🚀.
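
A quick sketch of what that looks like in practice. The sample strings are arbitrary, and this assumes your notebook or terminal can display these characters (Jupyter can).

```python
greeting = "¡Hola! 😊"
city = "München"
japanese = "日本語"

print(greeting)
print(city.upper())     # MÜNCHEN (accented letters are handled too)
print(len(japanese))    # 3 (each character counts once)
```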

## Combining Text and Variables (Concatenation & f-Strings)
Say you're working at a startup and you want to send a custom welcome message to each new user. You have their name in a variable, and you need to include it in a greeting string. This is where combining text and variables comes in.

**Concatenation** is a fancy word for sticking strings together. In Python, you can concatenate with the `+` operator. For example:

In [4]:
first_name = "Ada"
last_name = "Lovelace"
full_name = first_name + " " + last_name
print("Welcome, " + full_name + "!")

Welcome, Ada Lovelace!


---
Here we took `first_name` and `last_name`, added a space in between, and combined them. Easy enough, right? 🎯  
But what if we want to include something that’s not a string, like a number? Suppose we have a `user = "Alex"` who has `messages = 5` new notifications. If we try concatenation directly:

In [8]:
user = "Alex"
messages = 5
# print("Hello, " + user + "! You have " + messages + " new messages.")  # <-- gonna error out

This will error out with a `TypeError`. Python will yell at you because you're trying to add a string and an integer. It's like trying to glue a piece of paper to a number 5 – it just doesn't know how to do that.  
**Fix #1:** Convert the number to a string using `str()`:

In [9]:
print("Hello, " + user + "! You have " + str(messages) + " new messages.")

Hello, Alex! You have 5 new messages.


---
That works! But let’s be honest, it looks a bit clunky with all those plus signs and `str()` calls.  
**Fix #2 (better):** Use an f-string. 🎉  
F-strings are one of Python's coolest features for making your strings dynamic and readable. An f-string is just a normal string with an `f` in front, and you can put variables or even expressions inside curly braces `{}` to plug their values in:

In [10]:
print(f"Hello, {user}! You have {messages} new messages.")

Hello, Alex! You have 5 new messages.


---

No fuss, no muss. Python sees the `f` and knows to replace `{user}` with the value of the `user` variable, and `{messages}` with the value of `messages`. You can even do math or call functions inside the `{}` if you want (for example, `{messages + 1}` would show `6` here). F-strings are not only cleaner, they're also typically faster and less error-prone than chaining a bunch of concatenations. 🎯  
A quick note: f-strings are available in Python 3.6 and above (which you almost certainly have nowadays). Before f-strings, people used the `.format()` method or the `%` operator for string formatting. You might see those in older code or StackOverflow answers, but we'll stick to f-strings here because they're the newest and easiest way. (If someone in an interview asks, you can mention all three methods – that'll earn you a 👏 for knowing your stuff.)
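
Here's a short sketch of expressions inside the curly braces, reusing the `user` and `messages` values from above. The three message texts are invented for illustration.

```python
user = "Alex"
messages = 5

# Arithmetic inside the braces
line_math = f"One more and you'll have {messages + 1} messages."
# A method call inside the braces
line_method = f"HELLO, {user.upper()}!"
# A function call inside the braces
line_function = f"Your name has {len(user)} letters."

print(line_math)       # One more and you'll have 6 messages.
print(line_method)     # HELLO, ALEX!
print(line_function)   # Your name has 4 letters.
```

Anything that produces a value can go inside the braces, but keeping those expressions short tends to keep the string readable.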

One more example to drive it home:  
Imagine you're analyzing sales at Amazon. You have a `product = "Echo Dot"` and `quantity_sold = 12000`. You want to report, "We sold 12000 Echo Dot units." Using an f-string:

In [11]:
product = "Echo Dot"
quantity_sold = 12000
print(f"We sold {quantity_sold} {product} units.")

We sold 12000 Echo Dot units.


See how painless that was? No need to worry about converting `quantity_sold` to a string – the f-string handled it.  
**Key takeaway:** for mixing text with numbers or variables, f-strings are your new best friend.

## Useful String Methods (Changing Case, Stripping, and More)
Now that you can build strings that include variable values, let's look at how to manipulate string content itself. Python strings come with a bunch of built-in functions called **methods** that make common text-processing tasks easy.  
Think of string methods as power-ups for your text. Here are some you'll use a lot:

- **Change case**: Want to shout in all caps or whisper in lowercase? Use:  
  - `upper()` – returns an uppercase version of the string.  
  - `lower()` – returns a lowercase version.  
  - `title()` – capitalizes the first letter of each word (useful for names, etc.).

Example:  

In [12]:
company = "NeuraCamp"
print(company.upper())   # "NEURACAMP"
print(company.lower())   # "neuracamp"
print("data science".title())  # "Data Science"

NEURACAMP
neuracamp
Data Science


🧠 These methods don’t change the original string; they just return a new modified string. (We'll explain why in a bit.)

- **Strip whitespace**: Ever get data with stray spaces or newlines at the edges, like `" hello \n"`? `strip()` is your janitor:  
  - `strip()` – removes any whitespace (spaces, tabs, newlines) from the beginning **and** end of a string.  
  - `rstrip()` – removes whitespace on the right end (end of the string) only.  
  - `lstrip()` – removes whitespace on the left end (start of the string) only.

Example:  

In [13]:
raw_text = "   Data Science is cool!   \n"
cleaned_text = raw_text.strip()
print(cleaned_text)  # "Data Science is cool!"

Data Science is cool!


This is super handy when cleaning up user input or data from files that might have accidental spaces. No more weird bugs from an extra space messing up a comparison!
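
To see that comparison bug (and the left/right variants) in action, here's a minimal sketch. The `answer` value is a made-up piece of user input, and `repr()` is used only because it makes the invisible whitespace visible.

```python
answer = "  yes \n"   # hypothetical raw input with stray whitespace

print(answer == "yes")            # False: the extra whitespace counts
print(answer.strip() == "yes")    # True

left_cleaned = answer.lstrip()    # removes whitespace at the start only
right_cleaned = answer.rstrip()   # removes whitespace at the end only
print(repr(left_cleaned))         # 'yes \n'
print(repr(right_cleaned))        # '  yes'
```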

- **Find and replace**: Need to find a substring or replace part of a string?  
  - `replace(old, new)` – replaces all occurrences of the substring `old` with `new`.  
  - `find(sub)` – returns the index of the first occurrence of `sub` in the string (or -1 if not found).

Example:  

In [15]:
sentence = "Python is awesome! Python is versatile."
print(sentence.replace("Python", "Data Science"))
# "Data Science is awesome! Data Science is versatile."
print(sentence.find("awesome"))
# 10 (the index where "awesome" starts; counting begins at 0)

Data Science is awesome! Data Science is versatile.
10


`replace` is great for quick substitutions (like censoring a word or updating a phrase), and `find` helps when you need to locate something in the text. (There's also a similar method `index()` which is like `find` but will error if the substring isn’t found, whereas `find` gives -1. Use whichever you prefer.)
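
The difference between `find()` and `index()` shows up only when the substring is missing. A small sketch, using a search term ("Java") chosen because it isn't in the sentence; the `try`/`except` block is a preview of later material and just keeps the cell from stopping.

```python
sentence = "Python is awesome! Python is versatile."

# find() quietly returns -1 when nothing matches.
missing_position = sentence.find("Java")
print(missing_position)          # -1

# index() raises a ValueError instead.
try:
    sentence.index("Java")
except ValueError:
    print("index() raised a ValueError because 'Java' was not found")
```

A rough rule of thumb: `find()` suits cases where "not found" is a normal outcome, while `index()` suits cases where a missing substring means something went wrong.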

These are just a few of the string methods available. Python has a whole bunch more (like checking if a string is all digits, splitting a string into a list of words, etc.), but the ones above are enough to cover a lot of common tasks when starting out.
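
As a taste of those extra methods, here's a brief sketch of `isdigit()` and `split()`. The sample values are invented, and the list that `split()` returns is a preview of a data type covered later.

```python
zip_code = "90210"
typo_code = "9021O"    # ends with the letter O, not a zero

print(zip_code.isdigit())     # True
print(typo_code.isdigit())    # False

words = "data science is fun".split()
print(words)                  # ['data', 'science', 'is', 'fun']
```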

**Important:** Remember how we said these methods don’t change the original string? That’s because Python strings are **immutable**. Immutable means **unchangeable**. Once a string is created, you can’t modify it in place. Any operation that transforms a string (like making it uppercase or replacing parts of it) actually creates a brand new string and leaves the original untouched.

For example:  

In [16]:
name = "Ada"
name.upper()
print(name)        # Still "Ada", because upper() didn't change it in place
name = name.upper()
print(name)        # Now it's "ADA", because we reassigned the result back to name

Ada
ADA


In the first `name.upper()` call, Python gave us a new string "ADA" but we didn’t save it, so it was lost to the ether. By assigning `name = name.upper()`, we caught that new string in the `name` variable. 🧠 Immutability might seem like a nuisance (why not just change the thing directly, right?), but it has benefits for performance and avoiding accidental side-effects. Just keep in mind: if you use a string method and want to keep the result, assign it to a variable (maybe the same variable if you want to update it).
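
Another way to see immutability is to keep the original and the results side by side. This is only a sketch: the variable names are hypothetical, and the last part uses indexing (`[0]`) and `try`/`except`, which we haven't covered yet.

```python
original = "data science"
shouted = original.upper()
swapped = original.replace("data", "computer")

print(original)   # data science (untouched)
print(shouted)    # DATA SCIENCE
print(swapped)    # computer science

# Trying to change a character in place fails with a TypeError.
try:
    original[0] = "D"
except TypeError:
    print("Strings can't be modified in place")
```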

One more neat trick: Python lets you repeat strings using the `*` operator. This isn’t a method, but it’s too fun not to mention. If you want to laugh "ha" three times: `"ha" * 3` gives `"hahaha"`. Useful for quick text generation or just adding some flair:

In [17]:
print("-" * 10)  # prints a line of 10 dashes, like "----------"

----------


This can create simple separators or dividers in your output. Not an everyday tool, but one to keep in your back pocket.
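
One possible use, assuming a made-up report title: combine `*` with `len()` so the divider always matches the width of the heading.

```python
title = "Monthly Sales Report"    # hypothetical heading
underline = "=" * len(title)      # one "=" per character in the title

print(title)
print(underline)

laugh = "ha" * 3
print(laugh)                      # hahaha
```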

## In-Notebook Practice: Try It Yourself

Time to roll up your sleeves and play with some strings. Open your Jupyter Notebook and try these mini-experiments to reinforce what you’ve learned:

### Experiment 1: F-String vs. Concatenation  
Create a variable `user = "Taylor"` and `points = 42`. Try printing a message like `"Congrats, Taylor! You scored 42 points."` first using concatenation (with `+` and `str()` for the number) and then with an f-string. Example:

In [18]:
user = "Taylor"
points = 42

# Using concatenation:
print("Congrats, " + user + "! You scored " + str(points) + " points.")

# Using an f-string:
print(f"Congrats, {user}! You scored {points} points.")

Congrats, Taylor! You scored 42 points.
Congrats, Taylor! You scored 42 points.


Both lines should output the same thing. Which one do you find cleaner? 🧠

### Experiment 2: Testing String Methods  
Take a string with weird casing or extra spaces and clean it up. For instance:

In [19]:
messy = "   PYThon is FUN!  \n"
print(messy.lower())        # see it all in lowercase
print(messy.strip())        # remove surrounding whitespace/newline
print(messy.upper().strip())  # upper-case and strip in one go

   python is fun!  

PYThon is FUN!
PYTHON IS FUN!


Observe how each method works. Try adding a `.replace("FUN", "powerful")` at the end of one of those and see what you get. The possibilities are endless when you chain methods like this.
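
One thing worth noticing when you chain: methods run left to right, so the order matters. A minimal sketch with the same `messy` string (the variable names for the results are just for illustration):

```python
messy = "   PYThon is FUN!  \n"

# lower() runs first, so there is no "FUN" left for replace() to find.
no_match = messy.lower().replace("FUN", "powerful")
# upper() keeps "FUN" intact, so replace() finds it.
match = messy.upper().replace("FUN", "powerful")
# Clean first, then replace the lowercase version.
tidy = messy.strip().lower().replace("fun", "powerful")

print(repr(no_match))   # '   python is fun!  \n'
print(repr(match))      # '   PYTHON IS powerful!  \n'
print(tidy)             # python is powerful!
```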

### Experiment 3: Immutability Check  
Confirm that strings are immutable by doing:

In [20]:
s = "immutable"
s.upper()
print(s)        # s will still be "immutable"
s = s.upper()
print(s)        # now s is "IMMUTABLE"

immutable
IMMUTABLE


This will show you that without reassigning, `s` stays the same after `s.upper()`. With reassignment, it changes. It's a little nuance of Python that's worth understanding early on.

Mess around with other string methods if you’re curious. For example, try `s.capitalize()`, `s.find("mut")`, or even `len(s)` to get the length of the string. Python is a playground, and the more you experiment, the more comfortable you’ll get. No harm done if you break something in the notebook — that’s what the reset kernel button is for 😄

## Assessments

Time to test your understanding of string manipulation. These exercises are practical and reflect things you might actually do in a data role. Give them a shot:

### Coding Challenges

1. **Personalized Welcome**: You're writing a program for a retail website to greet new users. You have a variable `customer_name = "Jane"` and a variable `account_balance = 150`. Use an f-string to print a message like: `"Hello, Jane! Your balance is $150."` (Don't forget to include the `$` in the message. The number is an int, but f-strings will handle that for you.)

2. **All Uppercase Shout-Out**: At a team meeting, you want to display the project code name loudly. You have `project = "Zeus"`. Write code to print the project name in all uppercase letters, prefixed by `"PROJECT: "`. The output should be: `PROJECT: ZEUS`. (Hint: use a string method to upper-case the name.)

3. **Clean That Input**: Suppose you asked a user to enter their country and they accidentally added spaces before and after the name. For example, `country = "   Canada   "`. Write code to strip the whitespace and print a message: `"Selected country: Canada"` (You should get rid of the extra spaces before concatenating or formatting the string.)

### Multiple Choice Questions

1. You have `name = "Alice"`. What is the difference between `print(name)` and `print("name")`?  
   A. `print(name)` will output Alice, while `print("name")` will output the literal word *name*.  
   B. There is no difference. Both will output *Alice*.  
   C. `print(name)` causes an error, but `print("name")` works.  
   D. `print("name")` will try to find a variable called *name*.

2. Which of the following code snippets **will** produce an error in Python? (Assume the variables exist where needed.)  
   A. `age = 30; print("I am " + age + " years old")`  
   B. `age = 30; print("I am " + str(age) + " years old")`  
   C. `age = "30"; print(f"I am {age} years old")`  
   D. `print(f"I am {30} years old")`

*(Choose the one correct answer)*

---

## Lesson Summary & Cheat Sheet

That was a jam-packed lesson! 🎉 Let’s recap the high points and lock in what you’ve learned about working with text in Python:

- **Strings are for text:** A string (`str` type) holds characters like letters, digits, symbols – basically any text. You create one by putting text in quotes. E.g., `"Hello"` or `'World'` or even `""` (empty string).

- **Combining strings:** Use `+` to concatenate (join) strings. But if you need to include non-string data (like numbers), you must convert it to a string first or use a formatting method. Attempting `"Age: " + 30` will error – you should do `"Age: " + str(30)` or use an f-string.

- **F-Strings (Formatted Strings):** Prefix a string with `f` and put variables in `{}` inside it. Example: `f"{name} is {age} years old"` will replace the placeholders with the values of `name` and `age`. This is the modern, clean way to build strings with dynamic content (way easier than old `%` or `format` methods).

- **Common string methods:** Python gives you built-in functions to tweak text:
  - `s.upper()` → returns an uppercase version of string `s`.
  - `s.lower()` → returns a lowercase version.
  - `s.strip()` → removes leading/trailing whitespace. (Use `s.rstrip()` or `s.lstrip()` for just right or left side.)
  - `s.replace("old", "new")` → swaps all occurrences of `"old"` with `"new"` in `s`.
  - `s.find("sub")` → finds the index of `"sub"` in `s` (or -1 if not found).

  *(There are many more, but these will cover 90% of your needs starting out.)*

- **Strings are immutable:** Once a string is created, it cannot be changed in place. All string methods that modify text will return a new string. Always remember to store that new string in a variable if you want to use it later. (If you call methods without assignment, the original stays the same.)

- **Length of a string:** Use the `len()` function to get the number of characters in a string. E.g., `len("Hello")` returns `5`. This counts letters, numbers, spaces, punctuation – every character. This is useful when you need to check things like message length or iterate through text.
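
A few quick `len()` checks, using arbitrary sample strings, to confirm that every character counts:

```python
word_length = len("Hello")             # 5
phrase_length = len("Hello, World!")   # 13 (comma, space, and "!" all count)
padded_length = len("  hi  ")          # 6 (spaces count too)
empty_length = len("")                 # 0

print(word_length, phrase_length, padded_length, empty_length)
```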

Keep this cheat sheet handy as you practice. Mastering strings means you’ll be comfortable dealing with text data, which is a huge part of real-world data science (think tweets, logs, user input, etc.).

In the next lesson, we’ll move on to a new kind of structure: **lists**. That’s right – you’ll learn how to store and manage collections of items (like a list of user names or a series of numbers) and do cool things with them. But that’s for next time. For now, give yourself a pat on the back for conquering text manipulation in Python. You’ve added a crucial tool to your data science toolkit!

---

## Python Strings Trivia That Actually Shows Up in Interviews

Some interviewers love to quiz you on fundamentals to make sure you really know your stuff. Here are a few string-related trivia bits that might pop up (and now you'll be ready to answer):

- **Are Python strings mutable or immutable?** They are **immutable**. This means you cannot change a string once it’s created — any operation that modifies a string will produce a new string. (Interviewers ask this to gauge if you understand how Python manages memory for things like strings.)

- **What does the 'f' in f-string stand for?** It stands for **formatted**; the full name is *formatted string literal*. F-strings were introduced in Python 3.6 as a convenient way to embed expressions inside string literals. They’re called "formatted" strings because they let you format/insert values on the fly. (If you mention PEP 498, the Python Enhancement Proposal that introduced f-strings, you’ll sound like a true Python nerd 🧠.)

- **How else can you format strings in Python?** Besides f-strings, you can use the older `%` operator (old C-style) or the `.format()` method. For example, `"Name: %s" % name` or `"Name: {}".format(name)` achieve similar outcomes. F-strings are generally preferred now for their clarity and efficiency.

- **Why are they called "strings"?** The term "string" comes from the idea of a “string of characters” threaded together. Historically, even back in older languages like C, a string was seen as a sequence (or chain) of characters in memory. No relation to guitar strings, but it’s a fun image to picture characters linked one after another!

- **Unicode support:** Python 3 stores strings in Unicode, meaning it can handle text from just about any written language (and symbols like emojis) natively. This is why you can do `print("¡Hola! 😀")` and it works just fine. If you’re ever asked, “Can Python handle international characters or accents?” – the answer is a resounding yes.
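
If an interviewer asks about the three formatting styles mentioned above, it can help to have seen them side by side. A small sketch with made-up values for `name` and `age`:

```python
name = "Ada"
age = 36    # hypothetical value for illustration

percent_style = "Name: %s, Age: %d" % (name, age)       # old C-style
format_style = "Name: {}, Age: {}".format(name, age)    # .format() method
f_string_style = f"Name: {name}, Age: {age}"            # f-string

print(f_string_style)                                    # Name: Ada, Age: 36
print(percent_style == format_style == f_string_style)   # True
```

All three build the same string here; the f-string simply keeps the values next to the text they belong to.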

Keep these tidbits in your back pocket. They not only make you better at understanding Python’s behavior, but they also come in handy when you need to impress in an interview or explain a tricky bug to a colleague. Now you’ve got the know-how to tackle text like a pro! Let’s move on to the next chapter and keep the momentum going. Happy coding!

---
