diff --git a/lessons/Part1/00_workshop_setup.ipynb b/lessons/Part1/00_workshop_setup.ipynb index 4a1117a..5cc1f86 100644 --- a/lessons/Part1/00_workshop_setup.ipynb +++ b/lessons/Part1/00_workshop_setup.ipynb @@ -4,21 +4,30 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "#### Before the workshop, complete the following steps to make sure that you are ready to start with the material for Part 1!\n", + "#### Before the workshop, read and complete the following steps to make sure that you are ready to start with the material for Part 1!\n", "\n", "# How do I run Python on my machine?\n", "\n", "There are a few options for running Python: \n", "\n", - "1) **Jupyter Notebook:** All files in this workshop are Jupyter notebooks with the extension `.ipynb`. The key feature of a Jupyter notebook is its organization around *executable cells*. Jupyter notebooks are the most common format you'll encounter with Python programming revolving around data processing and analysis. They can be initialized with Anaconda Navigator or via the command line.\n", + "**Recommended:**\n", + "- **Jupyter Notebook/JupyterLab:** All files in this workshop are Jupyter notebooks with the extension `.ipynb`. The key feature of a Jupyter notebook is its organization around *executable cells*. Jupyter notebooks are the most common format you'll encounter with Python programming revolving around data processing and analysis. Like a word document, Jupyter notebooks need to be opened in an appropriate app. Two most common options (Jupyter Notebook and JupyterLab) come installed with Anaconda (which you will install in the steps below). They can be opened with Anaconda Navigator or via the command line.\n", "\n", - "2) **Google Colab:** Similar in layout to a Jupyter notebook. The key difference is that Colab is hosted on Google servers rather than being run locally on your machine. 
Colab notebooks have more support for parallel processing (useful for large datasets and complex models), and don't require a Python distribution on your machine. They're similarly in an `.ipynb` format.\n", + "**Next best options:**\n", + "- **DataHub:** A cloud-based service at UC Berkeley for interacting with Jupyter Notebooks. If you haven't been able to install Anaconda on your machine, or it runs too slowly, you can use DataHub to host your materials on the cloud. To do so, click on the Launch DataHub button in the Readme of the repository.\n", "\n", - "3) **Spyder (or another [IDE](https://en.wikipedia.org/wiki/Comparison_of_integrated_development_environments#Python)):** An integrated development environment (IDE) is a software application that provides a host of tools for debugging and development. They are usually used for running Python script files with extension `.py`, though newer IDEs are adding functionality for Jupyter notebooks. Spyder comes installed with the Anaconda distribution.\n", + "- **Binder:** Another cloud-based service for interacting with Jupyter Notebooks. This is like DataHub but a little slower. Use this alternative if you weren't able to complete the steps below, but don't have a UC Berkeley log-in.\n", "\n", - "4) **Command Line:** It is also possible to run `.py` scripts directly in your terminal or command line (and even write the script in a text editor). This is not typically used for active development because it lacks the editor and tools of the above methods, but is sometimes used for executing scripts quickly.\n", "\n", - "For this workshop, we will use the Anaconda distribution of Python and the included Jupyter Notebook application. Before the first workshop, complete the following steps to get the workshop materials ready.\n", + "**Other alternatives:**\n", + "\n", + "- **Google Colab:** Another interface for working with Jupyter notebooks. 
The key difference is that Colab is hosted on Google servers rather than being run locally on your machine. Colab notebooks have more support for parallel processing (useful for large datasets and complex models), and don't require a Python distribution on your machine. \n", + "\n", + "- **Spyder (or another [IDE](https://en.wikipedia.org/wiki/Comparison_of_integrated_development_environments#Python)):** An integrated development environment (IDE) is a software application that provides a host of tools for debugging and development. They are usually used for running Python script files with extension `.py`, though newer IDEs are adding functionality for Jupyter notebooks. Spyder comes installed with the Anaconda distribution.\n", + "\n", + "- **Command Line:** It is also possible to run `.py` scripts directly in your terminal or command line (and even write the script in a text editor). This is not typically used for active development because it lacks the editor and tools of the above methods, but is sometimes used for executing scripts quickly.\n", + "\n", + "For this workshop, we will use the Anaconda distribution of Python and the included Jupyter Notebook or JupyterLab application. Before the first workshop, complete the following steps to get the workshop materials ready.\n", "\n", "\n", "## Step 1: Install Anaconda\n", @@ -68,7 +77,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.8.12" + "version": "3.9.12" } }, "nbformat": 4, diff --git a/lessons/Part1/01_introduction.ipynb b/lessons/Part1/01_introduction.ipynb index bf6481c..6a5df28 100644 --- a/lessons/Part1/01_introduction.ipynb +++ b/lessons/Part1/01_introduction.ipynb @@ -17,7 +17,7 @@ "\n", "Most programmers can program in more than one language. That's because they know *how to program* generally, as opposed to \"knowing\" Python, R, Matlab, or any other language.\n", "\n", - "In other words, programming is an extendible skill. 
Basic programming concepts - conditionals, for loops, functions - can be found in almost any programming language with small variations. Thus \"knowing how to program\" is not an exercise in memorization. \n", + "In other words, programming is an extendible skill. Basic programming concepts can be found in almost any language with small variations. Thus \"knowing how to program\" is not an exercise in memorization. \n", "\n", "Even within Python, there are too many packages and functions to memorize. Rather, a programmer knows 1) general structures and programming logic, 2) how to find and use new functions, and 3) how to work through problems that arise. \n", "\n", @@ -38,14 +38,19 @@ "1. State the goals of your code as clearly as possible.\n", "2. Plan out the general logic of steps needed to achieve the goal.\n", "3. Translate the steps into code:\n", - " a. Build up steps piece by piece.\n", - " b. Test frequently to make sure code is working as expected and handle bugs as quickly as possible.\n", + " 1. Build up steps piece by piece.\n", + " 2. Test frequently to make sure code is working as expected and handle bugs as quickly as possible.\n", + "4. Check the output.\n", "\n", - "For each option, it is useful to predict what you think the output should look like, then compare that to the output of your code. These steps can help you code more effectively and make it easier to deal with issues that do come up, but there is still no way to entirely avoid bugs.\n", + "These steps can help you code more effectively and make it easier to deal with issues that do come up, but there is still no way to entirely avoid bugs. However, debugging is a learnable skill just like any other part of coding!\n", "\n", "## Debugging\n", "\n", - "Here's a useful mental workflow to keep in mind when you want to try and debug an error:\n", + "There are two kinds of bugs in coding:\n", + "1. Those that give an error message. 
We will practice reading lots of error messages in this workshop but remember -- error messages are your friend!\n", + "2. Those that don't give an error message, but give an incorrect or nonsensical output. These are harder to find, but not impossible! \n", + "\n", + "It all comes down to understanding **where** the error occurs, and **why**, then you can make a plan for **how** to work around it. Here's a useful mental workflow to keep in mind when you want to try and debug an error:\n", "\n", "1. Read the errors!\n", "2. Read the documentation.\n", @@ -70,6 +75,13 @@ "\n", "Don't reinvent the wheel - learning how to find the answer to the issues you run into is a critical part of becoming a capable programmer!" ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] } ], "metadata": { @@ -88,7 +100,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.8.12" + "version": "3.9.12" } }, "nbformat": 4, diff --git a/lessons/Part1/02_jupyter_notebooks.ipynb b/lessons/Part1/02_jupyter_notebooks.ipynb index 8ef13dd..f8f0e89 100644 --- a/lessons/Part1/02_jupyter_notebooks.ipynb +++ b/lessons/Part1/02_jupyter_notebooks.ipynb @@ -8,9 +8,9 @@ "\n", "\n", "**Learning Objectives**:\n", - "- Launch the Jupyter Notebook, create new notebooks, and exit the Notebook.\n", - "- Create and run Python cells in a notebook.\n", - "- Learn key Python formatting principles\n", + "- Create, open, and exit notebooks.\n", + "- Edit and run Python cells in a notebook.\n", + "- Learn key Python formatting principles.\n", "* * * * *" ] }, @@ -20,16 +20,26 @@ "source": [ "## Navigating Jupyter Notebooks\n", "\n", - "In Jupyter Notebooks, code is typed into cells. Cells are individual units in which code can be separately run. In contrast to many IDEs, running code is done with a keyword combination: `Shift+Enter`. 
Running `Shift+Enter` on a selected cell will run the code in the cell and then automatically move your cursor to the following cell.\n", + "In Jupyter Notebooks, code is divided into cells which can each be run separately. This is the main distinction between Jupyter Notebook `.ipynb` format and Python script `.py` format. Running a cell is done with a key combination: `Shift+Enter`. `Shift+Enter` will run the code in the selected cell and then automatically move to the following cell.\n", "\n", - "Try to run the following code using `Shift+Enter` now." + "Try to run the following code using `Shift+Enter` now. \n", + "\n", + "**Question:** What was the output of the code?" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 1, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Hello World!\n" + ] + } + ], "source": [ "print(\"Hello World!\")" ] @@ -38,16 +48,26 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "If you hit **Enter** only, Jupyter Notebook gives you another line in the current cell.\n", + "If you hit `Enter` only, Jupyter Notebook gives you another line in the current cell.\n", "\n", - "This allows you to compose multi-line commands and submit them to Python all at once." + "This allows you to compose multi-line commands and submit them to Python all at once.\n", + "\n", + "**Question:** What does the following cell do? What is the output?" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 2, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "3\n" + ] + } + ], "source": [ "a = 1 + 2\n", "print(a)" @@ -66,7 +86,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Try using `Control+Enter` to run this cell a few times. What happens?" + "**Question**: Try using `Control+Enter` to run this cell three times. What is the output? Run the cell one more time. 
Is the output the same?" ] }, { @@ -83,7 +103,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "If you want to create new empty cells, you can use Insert -> Insert Cell Below or use the Insert Cell Below button at the top of the notebook. Try entering a new cell below this one." + "If you want to create new empty cells, you can use Insert ==> Insert Cell Below or use the Insert Cell Below button at the top of the notebook. Try entering a new cell below this one." ] }, { @@ -92,11 +112,11 @@ "source": [ "## Markdown\n", "\n", - "Jupyter notebooks allow you type in Markdown as well as code. In fact, this very cell is written in Markdown! We use markdown to narrate the workshop and provide context. Markdown is also used for documentation of code in Python notebooks more generally.\n", + "Jupyter notebooks allow you to combine text and code using a system called markdown. In fact, this very cell is written in markdown! We use this formatting language to narrate the workshop and provide context. (Imagine reading this notebook with no markdown!) Markdown is also used for documentation of code in Python notebooks more generally.\n", "\n", - "Markdown has its own syntax, but it's easy to learn. Here's a [cheatsheet](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet) that can help.\n", + "Markdown has its own syntax, but it's fairly straightforward. Here's a [cheatsheet](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet) that can help. You can also double-click on any of the markdown cells to see how they are made.\n", "\n", - "Double click the cell below to see the markdown code rendering the output." + "Double click the cell below to see the markdown code rendering the output. Then do `Shift+Enter` to go back to the formatted text." 
] }, { @@ -105,16 +125,24 @@ "source": [ "## Clearing Jupyter\n", "\n", - "Jupyter remembers everything it executed, **even if it's not currently displayed in the notebook**.\n", + "Jupyter remembers every line of code it executed, **even if it's not currently displayed in the notebook**. This means that deleting a line of code does not delete it from the notebook's memory if it has already been run. Instead, to clear everything from Jupyter use Kernel -> Restart in the menu. The kernel is basically the program actually running the code, so if you reset the kernel, it's as if you just opened up the notebook for the first time. All of the outputs are forgotten, and the variables are reset.\n", "\n", - "To clear everything from Jupyter use Kernel -> Restart in the menu." + "Let's see how this actually works. First, run the cell below. What is the output?" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 6, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "And three shall be the count.\n" + ] + } + ], "source": [ "mystring = \"And three shall be the count.\" \n", "\n", @@ -125,14 +153,22 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Now use Kernel -> Restart in the menu! You can also press the \"Reset\" button in the icon bar." + "Now use Kernel -> Restart in the menu! You can also press the \"Reset\" button in the icon bar. Then run the code below. What happens?" 
] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 7, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "And three shall be the count.\n" + ] + } + ], "source": [ "print(mystring)" ] @@ -185,11 +221,11 @@ "source": [ "### Commenting\n", "\n", - "We will discuss how and why to comment code later in this series, but it's also useful when you temporarily don't want to run a section of code.\n", + "We will discuss how and why to comment code later in this series, but we will introduce it now because it's useful when you temporarily don't want to run a section of code.\n", "\n", "Simply place a pound sign `#` at the beginning of the line, and that line won't run. Any uncommented lines will be treated as code.\n", "\n", - "Try running the cell below, then comment out `bad_thing`, and run it again.\n" + "Try running the cell below, then comment out `bad_thing`, and run it again. What changes?\n" ] }, { @@ -257,7 +293,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.8.12" + "version": "3.9.12" }, "toc": { "base_numbering": 1, diff --git a/lessons/Part1/03_variables.ipynb b/lessons/Part1/03_variables.ipynb index 81291e0..1992df5 100644 --- a/lessons/Part1/03_variables.ipynb +++ b/lessons/Part1/03_variables.ipynb @@ -4,7 +4,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Variables and Assignment\n", + "# Variables\n", "\n", "\n", "**Learning Objectives**\n", @@ -19,13 +19,12 @@ "source": [ "## Assigning and Printing Variables\n", "\n", - "Variables are a fundamental tool in programming.\n", "\n", "* Variables are placeholders for useful values that we want to refer to again later in the code.\n", "* In Python, the `=` symbol assigns the value on the right to the name on the left.\n", - "* The variable is created when a value is assigned to it. 
When you call the variable, it will refer to whatever value was assigned to it at that point in time.\n", + "* The variable is created when a value is assigned to it. When you call the variable, it will refer to whatever value it currently holds.\n", "\n", - "Here's Python code that assigns an age to a variable `age` and a name in quotation marks to a variable `first_name`." + "Here's Python code that assigns a year to a variable `year` and a month in quotation marks to a variable `month`." ] }, { @@ -34,12 +33,12 @@ "metadata": {}, "outputs": [], "source": [ - "age = 42\n", - "first_name = 'Ahmed'\n", + "year = 2020\n", + "month = 'July'\n", "\n", "# We can print variables with print()\n", - "print(age)\n", - "print(first_name)" + "print(year)\n", + "print(month)" ] }, { @@ -55,7 +54,7 @@ "metadata": {}, "outputs": [], "source": [ - "print(\"First name:\", first_name, \". Age:\", age)" + "print(\"Year:\", year, \". Month:\", month)" ] }, { @@ -71,7 +70,7 @@ "metadata": {}, "outputs": [], "source": [ - "print(\"First name: \", first_name, \". Age: \", age, sep='')" + "print(\"Year: \", year, \". Month: \", month, sep='')" ] }, { @@ -90,9 +89,9 @@ "Not following these rules will result in an error in Python. \n", "\n", "* In addition, some **guidelines** for variable naming are:\n", - " * Python is case-sensitive (`First_name` and `first_name` are two separate variables).\n", - " * Use meaningful variable names (e.g. `first_name` is more informative than `x`).\n", - " * Be consistent in your formatting (e.g separate words the same way every time).\n", + " * Python is case-sensitive (`Year` and `year` are two separate variables).\n", + " * Use meaningful variable names (e.g. `year` is more informative than `x`). 
A good reference is that you should be able to tell what is going on in the code and variables without having to run it.\n", + " * Be consistent in your formatting (e.g avoid StartYear and Stop_year).\n", " * Avoid overlap with existing variables and functions (e.g., `print`, `sum`, `str`).\n", "\n", "While these won't result in an error directly, they may result in unexpected behavior in your code. In addition, the code may be harder to parse by other people (or future you!)." @@ -104,7 +103,10 @@ "source": [ "## Challenge 1: Debugging Variable Names\n", "\n", - "The following pieces of code include variable names that cause an error. What's wrong and how would we fix it? Are there any other guidelines that aren't being met with the variable names?" + "The following pieces of code include variable names that cause an error. For each block of code consider the following questions:\n", + "1. Which **rule** is being broken? Can you find this information in the error message?\n", + "2. What **guidelines** aren't being followed? \n", + "3. How would you change the code?" ] }, { @@ -136,10 +138,11 @@ "source": [ "## Variable Arithmetic\n", "\n", - "* We can use variables in calculations just as if they were values.\n", - "* Operators are shown in purple in a Jupyter Notebook. These are special symbols that tell Python to perform certain operations.\n", + "* The key feature of variables is that we can use them in calculations and functions just as if they were values.\n", + "* **Operators** (special symbols that perform calculations) are shown in purple in a Jupyter Notebook. These are special symbols that tell Python to perform certain operations.\n", + "* **Functions** are processes that perform multiple operations on variables. We will cover these in a later notebook. \n", "\n", - "Let's check out some common operations:" + "Let's check out some common operations below. Note what values get substituted in for the variables in each operation. 
" ] }, { @@ -175,9 +178,9 @@ "source": [ "## Challenge 2: Words to Code\n", "\n", - "Translate the following clause into code and save the result to a variable. \n", + "Translate the following line into code and save the result to a variable. \n", "\n", - "Divide 15 by the sum of a and three times b. Multiply the result by 2 and raise it to the 3rd power. What is the result? \n", + "Divide 15 by the sum of a and three times b. Multiply the result by 2 and raise it to the 3rd power. What is the result?\n", "\n", "**Hint**: Order of operations applies in Python (i.e., PEMDAS)." ] @@ -196,7 +199,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "How many lines of code did you do the challenge in? Can you do it in a single line?" + "How many lines of code did you do the challenge in? Can you do it in a single line? What did you name your variable?" ] }, { @@ -246,8 +249,15 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "This is a common technique that is used for swapping variable values around." + "This is a common technique that is used for swapping variables around. However, often we might choose to just use new variables, rather than overwrite the ones here. Can you think of a reason why we might avoid overwriting a variable?" ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] } ], "metadata": { @@ -267,7 +277,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.8.12" + "version": "3.9.12" }, "toc": { "base_numbering": 1, diff --git a/lessons/Part1/04_data_types.ipynb b/lessons/Part1/04_data_types.ipynb index 3dc1f85..42a0368 100644 --- a/lessons/Part1/04_data_types.ipynb +++ b/lessons/Part1/04_data_types.ipynb @@ -19,16 +19,29 @@ "source": [ "## What is a Data Type?\n", "\n", - "* Every value in a program has a specific **type**. Types tell Python how to interact with a variable. 
For example, you can use the division operation with numbers, but not variables containing text.\n", + "* Every value in a program has a specific **type**. Types tell Python how to interact with a variable. For example, you can use the division operation with numbers, but not text.\n", "* Sometimes types are obvious, but sometimes they can surprise us.\n", - "* We use the `type` function to identify what the type is of a current variable." + "* We use the `type()` **function** to identify what the type is of a current variable. Functions are signified by parentheses following them, which contain any inputs to the function.\n", + "\n", + "\n", + "Let's check the types of some variables below. Predict the type for each variable. Were you surprised by any of the answers?" ] }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 1, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "\n" + ] + } + ], "source": [ "pi = 3.14159\n", "print(type(pi))\n", @@ -41,6 +54,13 @@ "print(type(pi2))" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Remember that when a variable is called, it takes the most recent value assigned to it, so calling `type(pi)` would be the same as `type(3.14159)`." + ] + }, { "cell_type": "markdown", "metadata": {}, @@ -51,18 +71,39 @@ "\n", "* **int**: Whole numbers (e.g., `a = 2`).\n", "* **float**: Decimal numbers (e.g., `a = 2.01`).\n", - "* **str**: Strings, which denotes text (e.g., `a = \"2\"`).\n", + "* **str**: Strings, which denotes text (e.g., `a = \"2\"` or `a = '2'`).\n", + "\n", + "Operations and functions work differently for different types. For example, subtraction works with numeric types like floats, but not with strings.\n", "\n", - "Operations and functions work differently for different types. For example, subtraction works with numeric types like floats, but not with strings." 
+ "**Note:** For strings you can use double or single quotes, as long as you are consistent." ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 2, "metadata": { "scrolled": true }, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "1.1415899999999999\n" + ] + }, + { + "ename": "TypeError", + "evalue": "unsupported operand type(s) for -: 'str' and 'str'", + "output_type": "error", + "traceback": [ + "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[1;31mTypeError\u001b[0m Traceback (most recent call last)", + "Input \u001b[1;32mIn [2]\u001b[0m, in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[0;32m 2\u001b[0m \u001b[38;5;28mprint\u001b[39m(pi \u001b[38;5;241m-\u001b[39m \u001b[38;5;241m2.0\u001b[39m)\n\u001b[0;32m 4\u001b[0m \u001b[38;5;66;03m# Subtraction with strings\u001b[39;00m\n\u001b[1;32m----> 5\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[43mfitness\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m-\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;124;43m'\u001b[39;49m\u001b[38;5;124;43ma\u001b[39;49m\u001b[38;5;124;43m'\u001b[39;49m)\n", + "\u001b[1;31mTypeError\u001b[0m: unsupported operand type(s) for -: 'str' and 'str'" + ] + } + ], "source": [ "# Subtraction with floats\n", "print(pi - 2.0)\n", @@ -80,9 +121,18 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "5.14159\n", + "averagea\n" + ] + } + ], "source": [ "# Addition with floats\n", "print(pi + 2.0)\n", @@ -102,16 +152,28 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Every variable has a type, but there can be overlap between the kinds of values that can be in each type. For example, we can write a number as either an integer or a string. 
Python treats these differently, even if the underlying concept is the same:" + "Every variable has a type, but there can be overlap between the kinds of values that can be in each type. For example, we can write a number as either an integer or a string. Python treats these differently, even if to us the value is the same:" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 4, "metadata": { "scrolled": true }, - "outputs": [], + "outputs": [ + { + "ename": "TypeError", + "evalue": "unsupported operand type(s) for -: 'int' and 'str'", + "output_type": "error", + "traceback": [ + "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[1;31mTypeError\u001b[0m Traceback (most recent call last)", + "Input \u001b[1;32mIn [4]\u001b[0m, in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[0;32m 1\u001b[0m a \u001b[38;5;241m=\u001b[39m \u001b[38;5;124m'\u001b[39m\u001b[38;5;124m3\u001b[39m\u001b[38;5;124m'\u001b[39m\n\u001b[0;32m 2\u001b[0m b \u001b[38;5;241m=\u001b[39m \u001b[38;5;241m3\u001b[39m\n\u001b[1;32m----> 4\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[43mb\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m-\u001b[39;49m\u001b[43m \u001b[49m\u001b[43ma\u001b[49m)\n", + "\u001b[1;31mTypeError\u001b[0m: unsupported operand type(s) for -: 'int' and 'str'" + ] + } + ], "source": [ "a = '3'\n", "b = 3\n", @@ -128,11 +190,20 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 6, "metadata": { "scrolled": true }, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n" + ] + } + ], "source": [ "print(type(b))\n", "print(type(a))" @@ -149,9 +220,27 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "3\n" + ] + }, + { + "data": { + "text/plain": [ + "int" + ] + }, 
+ "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "print(int(a))\n", "type(int(a))" @@ -159,9 +248,17 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "0\n" + ] + } + ], "source": [ "print(b - int(a))" ] @@ -175,9 +272,21 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "ename": "ValueError", + "evalue": "invalid literal for int() with base 10: 'letters'", + "output_type": "error", + "traceback": [ + "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[1;31mValueError\u001b[0m Traceback (most recent call last)", + "Input \u001b[1;32mIn [9]\u001b[0m, in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[1;32m----> 1\u001b[0m \u001b[38;5;28;43mint\u001b[39;49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m'\u001b[39;49m\u001b[38;5;124;43mletters\u001b[39;49m\u001b[38;5;124;43m'\u001b[39;49m\u001b[43m)\u001b[49m\n", + "\u001b[1;31mValueError\u001b[0m: invalid literal for int() with base 10: 'letters'" + ] + } + ], "source": [ "int('letters')" ] @@ -195,14 +304,23 @@ "source": [ "## Implicit Type Conversion\n", "\n", - "Python will automatically convert some types during operations. This is implicit type conversion, since you don't need to explicitly say what you are converting too. For examples, integers can be converted to floats as needed when performing arithmetic." + "Python will automatically convert some types during operations. This is implicit type conversion, since you don't need to explicitly say what you are converting to. For examples, integers can be converted to floats as needed when performing arithmetic." 
] }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Half is 0.5\n", + "Three squared is 9.0\n" + ] + } + ], "source": [ "print('Half is', 1 / 2.0)\n", "print('Three squared is', 3.0 ** 2)" @@ -212,16 +330,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "You won't always have to explicitly convert types, which is an advantage in Python. However, this can cause unexpected behavior if you are not aware of it. For example, if we wanted the output of division to be an integer (called floor division), we would need to use another operation `//` in order to prevent the automatic conversion:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "1 // 2" + "You won't always have to explicitly convert types, which is an advantage in Python. However, this can cause unexpected behavior if you are not aware of it. Using `type()` liberally can help you check what is going on in the code." 
] }, { @@ -237,7 +346,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 12, "metadata": {}, "outputs": [], "source": [ @@ -256,12 +365,12 @@ "\n", "Documentation for these methods can be accessed with `type.[METHOD_NAME]?`.\n", "\n", - "Let's look at the built-in method `upper`, which can be applied to strings:" + "Let's look at the built-in method [`upper`](https://python-reference.readthedocs.io/en/latest/docs/str/upper.html), which can be applied to strings:" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 13, "metadata": {}, "outputs": [], "source": [ @@ -270,9 +379,20 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 14, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'GIRAFFE'" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "giraffe = 'giraffe'\n", "giraffe.upper()" @@ -287,9 +407,20 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 15, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 15, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "'giraffe'.startswith('gir')" ] @@ -298,14 +429,27 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Methods can also be chained in a single line, as long as the output of one directly feeds into the input of the next. Will the output of the code below be True or False? Why?" + "Methods can also be chained in a single line, as long as the output of one directly feeds into the input of the next. These lines can be read sequentially left to right. Write out the steps that `giraffe` goes through in the following line. What do you think the final output will be?\n", + "\n", + "Next, run the code. Does the output match what you expected? 
If not, go back to each step and figure out what happened differently." ] }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 16, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 16, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "'giraffe'.upper().startswith('gir')" ] @@ -323,14 +467,17 @@ "source": [ "## Challenge 2: String Methods\n", "\n", - "Use `str.split?` to read the documentation for `str.split()`. Use `str.split()` on the following sentences. What happens? What does `sep=` do? Try using `sep='.'`. \n", + "1. Use `str.split()` on the following sentences. What is the output?\n", + "2. Use `str.split?` to read the documentation for `str.split()`. What does `sep=` do? Where have we seen `sep=` before?\n", + "3. Try using `sep='.'`. What is the output?\n", + "4. What is the default value of `sep=`?\n", "\n", "**Bonus**: What is the type of the output?" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 17, "metadata": {}, "outputs": [], "source": [ @@ -342,7 +489,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 18, "metadata": {}, "outputs": [], "source": [ @@ -355,19 +502,28 @@ "source": [ "## Challenge 3: Replacing a character\n", "\n", - "In the filename below, we want to remove spaces `' '` and replace it with an underscore `_`. Use the [string methods](https://docs.python.org/3/library/stdtypes.html#string-methods) and identify an appropriate method. Use that method to get the result: `\"Firstname_Lastname.csv\"\n", + "Let's say we have a bunch of filenames with spaces in them. However, we want to remove spaces `' '` and replace them with underscores `_`. Use the [string methods](https://docs.python.org/3/library/stdtypes.html#string-methods) reference and identify an appropriate method. 
Use that method to get the result: `\"Firstname_Lastname.csv\"`.\n", "\n", "**Bonus**: There is always more than one way to solve a programming problem. How many different ways can you solve the problem above?" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "sentence4 = \"Firstname Lastname.csv\"" ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# YOUR CODE HERE" + ] } ], "metadata": { @@ -386,7 +542,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.8.12" + "version": "3.9.12" }, "toc": { "base_numbering": 1, diff --git a/lessons/Part1/05_functions.ipynb b/lessons/Part1/05_functions.ipynb index 0454f1d..74530a6 100644 --- a/lessons/Part1/05_functions.ipynb +++ b/lessons/Part1/05_functions.ipynb @@ -10,7 +10,7 @@ "- Define functions and arguments.\n", "- Call Python functions.\n", "- Correctly nest calls to built-in functions.\n", - "- Use help to display [documentation](https://github.com/dlab-berkeley/python-intensive/blob/master/Glossary.md#documentation) for built-in functions.\n", + "- Use help to display [documentation](https://github.com/dlab-berkeley/Python-Fundamentals/blob/main/glossary.md) for built-in functions.\n", "\n", "*****" ] @@ -21,7 +21,7 @@ "source": [ "## Functions\n", "\n", - "**Functions** are a core part of programming that allows us to run complex operations over and over without needing to write the code again. **Arguments**, or values passed to a function, allow for us to use functions in more general ways.\n", + "**Functions** are a core part of programming that allow us to run complex operations repeatedly without needing to rewrite the code each time. 
**Arguments**, or values passed to a function, allow us to use functions in more general ways.\n", "\n", "For example, a (made-up) function `multiply_by_five(2)` may take a single argument 2 and multiply it by five. An alternative function `multiply(2, 5)` may have two arguments, but be more generalizable to other multiplication tasks.\n", "\n", @@ -34,11 +34,29 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 1, "metadata": { "scrolled": true }, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "one two\n" + ] + }, + { + "data": { + "text/plain": [ + "3.142" + ] + }, + "execution_count": 1, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "len('before')\n", "print('one', 'two')\n", @@ -49,23 +67,49 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "An argument with out the proper number of arguments will give an error, which will give some information about what arguments you need for the function to be successful." + "A function called without the proper number of arguments will raise an error whose message gives some information about which arguments the function needs.\n", + "\n", + "**Question:** Look at the errors below. From the error message, how many arguments does the function take?" 
] }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "ename": "TypeError", + "evalue": "len() takes exactly one argument (0 given)", + "output_type": "error", + "traceback": [ + "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[1;31mTypeError\u001b[0m Traceback (most recent call last)", + "Input \u001b[1;32mIn [9]\u001b[0m, in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[1;32m----> 1\u001b[0m \u001b[38;5;28;43mlen\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n", + "\u001b[1;31mTypeError\u001b[0m: len() takes exactly one argument (0 given)" + ] + } + ], "source": [ "len()" ] }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "ename": "TypeError", + "evalue": "round() takes at most 2 arguments (3 given)", + "output_type": "error", + "traceback": [ + "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[1;31mTypeError\u001b[0m Traceback (most recent call last)", + "Input \u001b[1;32mIn [10]\u001b[0m, in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[1;32m----> 1\u001b[0m \u001b[38;5;28;43mround\u001b[39;49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m3.14\u001b[39;49m\u001b[43m,\u001b[49m\u001b[38;5;241;43m1\u001b[39;49m\u001b[43m,\u001b[49m\u001b[38;5;241;43m0\u001b[39;49m\u001b[43m)\u001b[49m\n", + "\u001b[1;31mTypeError\u001b[0m: round() takes at most 2 arguments (3 given)" + ] + } + ], "source": [ "round(3.14,1,0)" ] @@ -79,9 +123,20 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "round" ] @@ -104,9 +159,17 
@@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "3\n" + ] + } + ], "source": [ "print(round(3.14))" ] @@ -120,9 +183,21 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 13, + "metadata": {}, + "outputs": [ + { + "ename": "TypeError", + "evalue": "object of type 'int' has no len()", + "output_type": "error", + "traceback": [ + "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[1;31mTypeError\u001b[0m Traceback (most recent call last)", + "Input \u001b[1;32mIn [13]\u001b[0m, in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[1;32m----> 1\u001b[0m \u001b[38;5;28;43mlen\u001b[39;49m\u001b[43m(\u001b[49m\u001b[38;5;28;43mround\u001b[39;49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m3.14\u001b[39;49m\u001b[43m)\u001b[49m\u001b[43m)\u001b[49m\n", + "\u001b[1;31mTypeError\u001b[0m: object of type 'int' has no len()" + ] + } + ], "source": [ "len(round(3.14))" ] @@ -133,16 +208,30 @@ "source": [ "## Challenge 1: Errors in Nested Functions\n", "\n", - "The following code gives an error. What type of error is it? How can we fix it?" + "The following code gives an error. What type of error is it? How can we fix it?\n", + "\n", + "**Hint:** `max()` takes two integers or floats as input and returns the maximum value as output.\n", + "\n", + "\n", + "**Bonus:** What is the code trying to do? What do you predict the output to be?" 
] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 16, "metadata": {}, - "outputs": [], + "outputs": [ + { + "ename": "SyntaxError", + "evalue": "unexpected EOF while parsing (3071631582.py, line 1)", + "output_type": "error", + "traceback": [ + "\u001b[1;36m Input \u001b[1;32mIn [16]\u001b[1;36m\u001b[0m\n\u001b[1;33m print(max(len('hi', len('hello'))\u001b[0m\n\u001b[1;37m ^\u001b[0m\n\u001b[1;31mSyntaxError\u001b[0m\u001b[1;31m:\u001b[0m unexpected EOF while parsing\n" + ] + } + ], "source": [ - "print(max(len('hi'), len('hello'))" + "print(max(len('hi', len('hello'))" ] }, { @@ -153,7 +242,8 @@ "\n", "The type and content of an argument must be compatible with the function.\n", "* For example, taking the maximum of no inputs - `max()` - is meaningless.\n", - "* Furthermore, `max()` requires types that are comparable (i.e., strings and ints can't be compared to each other)." + "* Furthermore, `max()` requires types that are comparable (i.e., strings and ints can't be compared to each other).\n", + "* Trying to predict what the output will be can help catch these kinds of errors early on." ] }, { @@ -182,15 +272,28 @@ "\n", "Some functions do not require you to enter a value for each argument. In these cases, it will use a **default argument** specified in the function.\n", "\n", - "* For example, `round()` will round an inputted floating-point number. It accepts two arguments: the number, and the number of decimal places to round off to.\n", - "* By default, it rounds to zero decimal places." + "* For example, `round()` will round a number. It accepts two arguments: the number, and the number of decimal places to round off to.\n", + "* By default, it rounds to zero decimal places.\n", + "\n", + "**Question:** Where do you think we can look to find what the default arguments are?" 
] }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 18, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "4" + ] + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "round(3.712)" ] @@ -204,9 +307,20 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 19, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "3.7" + ] + }, + "execution_count": 19, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "round(3.712, 1)" ] @@ -220,7 +334,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 20, "metadata": {}, "outputs": [], "source": [ @@ -236,19 +350,51 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ + "execution_count": 21, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "3.0" + ] + }, + "execution_count": 21, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "#this works\n", "round(3.000, 2)" ] }, { - "cell_type": "code", - "execution_count": null, + "cell_type": "markdown", "metadata": {}, - "outputs": [], "source": [ + "**Question:** Why does the code below give an error? 
" + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "metadata": {}, + "outputs": [ + { + "ename": "TypeError", + "evalue": "'float' object cannot be interpreted as an integer", + "output_type": "error", + "traceback": [ + "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[1;31mTypeError\u001b[0m Traceback (most recent call last)", + "Input \u001b[1;32mIn [24]\u001b[0m, in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[0;32m 1\u001b[0m \u001b[38;5;66;03m#this doesn't\u001b[39;00m\n\u001b[1;32m----> 2\u001b[0m \u001b[38;5;28;43mround\u001b[39;49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m2\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m3.000\u001b[39;49m\u001b[43m)\u001b[49m\n", + "\u001b[1;31mTypeError\u001b[0m: 'float' object cannot be interpreted as an integer" + ] + } + ], + "source": [ + "#this doesn't\n", "round(2, 3.000)" ] }, @@ -276,9 +422,20 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 25, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "(0, 5)" + ] + }, + "execution_count": 25, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "divmod(5, 16)" ] @@ -294,9 +451,24 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 26, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Help on built-in function round in module builtins:\n", + "\n", + "round(number, ndigits=None)\n", + " Round a number to a given precision in decimal digits.\n", + " \n", + " The return value is an integer if ndigits is omitted or None. Otherwise\n", + " the return value has the same type as the number. 
ndigits may be negative.\n", + "\n" + ] + } + ], "source": [ "help(round)" ] @@ -307,7 +479,7 @@ "source": [ "## Every Function Returns a Value\n", "\n", - "* Every [function call](https://github.com/dlab-berkeley/python-intensive/blob/master/Glossary.md#function-call) produces some result.\n", + "* Every [function call](https://github.com/dlab-berkeley/Python-Fundamentals/blob/main/glossary.md) produces some result.\n", "* If the function doesn't have a useful result to return,\n", " it usually returns the special value `None`.\n", "* Unless the goal of the function is to print results, you usually want to save the output so you can refer to it later." @@ -315,9 +487,17 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 27, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "The output of divmod(16, 5) is (3, 1)\n" + ] + } + ], "source": [ "output = divmod(16, 5)\n", "print('The output of divmod(16, 5) is', output)" @@ -329,10 +509,10 @@ "source": [ "## Functions, Objects, and Methods\n", " \n", - "Some Python vocabulary: \n", + "We've already covered most of these topics, but let's take a look at their definitions below:\n", "\n", "* A **function** is a block of code that can be reused. It can be passed data to operate on (i.e., the arguments) and can optionally return data (the return value).\n", - "* An **object** is a collection of conceptually related variables and functions using those variables. Every [object](https://github.com/dlab-berkeley/python-intensive/blob/master/Glossary.md#object) is an instance of a `class`, which is like a blueprint for an object. \n", + "* An **object** is a collection of conceptually related variables and functions using those variables. Every [object](https://github.com/dlab-berkeley/Python-Fundamentals/blob/main/glossary.md) is an instance of a `class`, which is like a blueprint for an object. 
\n", "* A **method** is a function which is tied to a particular object. Each of an object's methods typically implements one of the things it can do, or one of the questions it can answer. It is called using the dot notation: e.g. `object.method()`.\n", "\n", "With these definitions in mind, note the following:\n", @@ -342,7 +522,7 @@ " \n", "Read more about objects, classes and methods [here](https://docs.python.org/3/tutorial/classes.html).\n", "\n", - "Check out our Python glossary [here](https://github.com/dlab-berkeley/python-intensive/blob/master/Glossary.md)." + "Check out our Python glossary [here](https://github.com/dlab-berkeley/Python-Fundamentals/blob/main/glossary.md)." ] }, { @@ -351,22 +531,38 @@ "source": [ "## Challenge 3: Nested Functions\n", "\n", - "1. Predict what each of the `print` statements in the program below will print.\n", + "1. Predict what each of the `print` statements in the program below will print. Run the code. If it doesn't match your expectations, write out each step in the code.\n", "2. Does `max(str(len(rich)), poor)` run or produce an error message?\n", " If it runs, does its result make any sense?" 
] }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 28, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "tin\n", + "4\n" + ] + } + ], "source": [ "rich = \"gold\"\n", "poor = \"tin\"\n", "print(max(rich, poor))\n", "print(max(len(rich), len(poor)))" ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] } ], "metadata": { @@ -385,7 +581,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.8.12" + "version": "3.9.12" } }, "nbformat": 4, diff --git a/lessons/Part2/06_data_structures.ipynb b/lessons/Part2/06_data_structures.ipynb deleted file mode 100644 index b33b349..0000000 --- a/lessons/Part2/06_data_structures.ipynb +++ /dev/null @@ -1,522 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Python Data Structures: Lists, Dictionaries, and Data Frames\n", - "\n", - "**Learning Objectives**\n", - "* Introduce some major data structures in Python: Lists, dictionaries, and data frames.\n", - "* Practice interacting with and manipulating these data structures.\n", - "\n", - "\n", - "Data structures are objects that organize and store data in a useful way. They're a bedrock of data analysis in programming. We'll cover three of the fundamental data structures in this lesson: lists, dictionaries, and data frames.\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Lists: Ordered Data Structures\n", - "\n", - "The first data structure we consider is a **list**. Lists are a collection of **ordered** items. Lists have a length, and the constituent items can be **indexed** based on their positions.\n", - "\n", - "They're most useful when storing a collection of values, when order is important. One nice thing about lists is that they can contain different types of data. 
For example, the entries of a list can be integers, floats, strings, and even other lists!\n", - "\n", - "We specify a list with square brackets: `[]`" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "country_list = [\"Afghanistan\", \"Canada\", \"Thailand\", \"Denmark\", \"Japan\"]\n", - "type(country_list)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "len(country_list)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can index a list using square brackets followng the list name, using the notation `[start:stop]`. Using a colon indicates that you want all entries between the two indices. If one side of the colon is empty, it indicates using one end of the list as the starting or ending points.\n", - "\n", - "Python is *zero*-indexed, meaning the first entry has index zero, not one! In addition, the `stop` index indicates 'up to but not including'. So, in `list[start:stop]`, `list[stop]` is not included." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "print(country_list[0])\n", - "print(country_list[1:4])\n", - "print(country_list[1:])\n", - "print(country_list[:4])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can also use negative numbers to indicate starting points relative to the last entry in the list:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "print(country_list[-1])\n", - "print(country_list[-4:-1])\n", - "print(country_list[-4:])\n", - "print(country_list[:-1])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Challenge 1: Slicing Lists\n", - "\n", - "Using the lists in the next cell:\n", - "\n", - "1. What does `thing[start:stop]` do? What is the output?\n", - "2. 
Write three different ways to slice the string from 'elephant' to the end." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "thing = [1, 3, 8, 'elephant', 'banana', 2]\n", - "start = 2\n", - "stop = 5" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### List Methods\n", - "\n", - "As we discussed in Part 1, objects can have methods associated with their data type that are accessed via dot notation (`object.method()`).\n", - "\n", - "Methods are functions that operate specifically on a particular data type. Lists have their own methods which perform operations specific to the structure of lists. The most common method is the `append()` method, which adds an item to the end of a list." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "print(country_list)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "country_list.append('USA')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "print(country_list)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Note that the `append()` method operates **in-place**: it modifies the object it is applied to. This is not always the case in Python - some methods return an object that must be stored in its own variable.\n", - "\n", - "There are many other useful list methods. Use the [documentation](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists) to investigate available methods for dealing with lists." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Challenge 2: Appending to Lists\n", - "\n", - "We've created a list called `thing` in the cell below.\n", - "\n", - "1. Append the following values to the list, individually: `'apple'`, `8`, and `9`. 
Print the ensuing list out.\n", - "2. Make a new list called `thing2` consisting of the values `'apple'`, `8`, and `9`. Append `thing2` to `thing`. How does the output differ from the output from the previous part?\n", - "3. Look at the [documentation](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists) for the list method `.extend()`. Is there a way to rewrite your answer to (2) to use extend? How does that compare to the outputs of the previous two parts?\n", - "4. What is one situation where you would use `append` and one where you would use `extend`?\n", - "\n", - "**Hint**: *Iterable* in Python means an object with multiple values that can be iterated through (including lists, tuples, and even strings)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "#1\n", - "thing = [1, 3, 8, 'elephant', 'banana', 2]\n", - "\n", - "# YOUR CODE HERE\n", - "\n", - "#2\n", - "thing = [1, 3, 8, 'elephant', 'banana', 2]\n", - "thing2 = []\n", - "\n", - "\n", - "#3\n", - "thing = [1, 3, 8, 'elephant', 'banana', 2]\n", - "thing2 = []\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Dictionaries: Key-Value Structures\n", - "\n", - "Dictionaries are organized on the principle of key-value pairs. The **keys** can be used to access the **values**. They're most useful when you have unordered data organized in pairs. This occurs, for example, in specifying metadata (data describing other data).\n", - "\n", - "Keys can be ints, floats, or strings, and are unordered. Values, however, can be any data type.\n", - "\n", - "Dictionaries are specified in Python using curly braces, with colons separating keys and values. Let's take a look at an example dictionary." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "example_dict = {\n", - " \"name\": \"Forough Farrokhzad\",\n", - " \"year of birth\": 1935,\n", - " \"year of death\": 1967,\n", - " \"place of birth\": \"Iran\",\n", - " \"language\": \"Persian\"}\n", - "\n", - "example_dict['year of birth']" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Like lists, dictionaries have their own methods. One of them is the `keys()` method:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "print(example_dict.keys())" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Remember how we did type conversion? We can do the same thing here, and cast the dictionary keys to a list, which we can then iterate through:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "list(example_dict.keys())" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Challenge 3: Creating a Dictionary\n", - "\n", - "Create a dictionary `fruits` with the following lists. Use the names of the list for the keys of the dictionary. Print the list of keys of the dictionary." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fruit = ['apple', 'orange', 'mango']\n", - "length = [3.2, 2.1, 3.1]\n", - "color = ['red', 'orange', 'yellow']" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Dictionaries are useful for hierarchical storage of data (and can even be nested!). They are also often used to initialize data frames, a useful data structure for tabular data." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Data Frames\n", - "\n", - "A common data structure you've likely already encountered is tabular data. 
Think of an Excel sheet: each column corresponds to a different feature of each datapoint, while rows correspond to different samples.\n", - "\n", - "In scientific programming, tabular data is often called a \"data frame\". In Python, there a specialized library called `pandas` which provides tools to create and manipulate data frames.\n", - "\n", - "We're going to explore `pandas` more closely in Part 3, but let's try creating a `pandas` `DataFrame` object right now. We'll do this by creating a dictionary:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fruit = ['apple', 'orange', 'mango', 'strawberry', 'salmonberry', 'thimbleberry']\n", - "size = [3, 2, 3, 1, 1, 1]\n", - "color = ['red', 'orange', 'orange', 'red', 'orange', 'red']\n", - "\n", - "fruits = {\n", - " 'fruit': fruit,\n", - " 'size': size,\n", - " 'color': color}" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Next, we import the `pandas` library and pass in the dictionary to the `pd.DataFrame()` function, storing the result in a variable called `df`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "\n", - "df = pd.DataFrame(fruits)\n", - "df" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The keys became column names and the values became cells in the `DataFrame`. In addition, there is an *index* on the left that keeps track of the row.\n", - "\n", - "Objects can also have **attributes**, or variables associated with the datatype. We can get the number of columns and rows with `df.shape`, an attribute of the dataframe. How many rows and columns does this dataframe have? 
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "df.shape" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Challenge 4: Initializing a DataFrame\n", - "\n", - "The following code gives an error. Why does it have an error? What are some ways we could fix this?" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fruit = ['apple', 'orange']\n", - "length = [3.2, 2.1, 3.1]\n", - "color = ['red', 'orange', 'yellow']\n", - "\n", - "fruit_dict = {\n", - " 'fruit': fruit,\n", - " 'length': length,\n", - " 'color': color}\n", - "\n", - "df_fruit = pd.DataFrame(fruit_dict)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### DataFrame Slicing and Methods\n", - "\n", - "We can choose a single column by selecting the name of that column. `pandas` calls this a `pd.Series` object. The act of obtaining a particular subset of a data frame is often referred to as **slicing**. This uses bracket notation to select part of the data." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "df" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Bracket notation to choose a column\n", - "df['fruit']" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Specify each dimension of the data frame separately using the loc method\n", - "df.loc[:, 'fruit'] # Colon selects all rows" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We can choose a row by using the `loc` method with the first entry: `df.loc[index, :]`." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Select the first row\n", - "df.loc[0, :]" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Select a single cell\n", - "df.loc[0, 'fruit']" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "`DataFrame`s also have methods, including those for [merging](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.merge.html?highlight=merge#pandas.DataFrame.merge), [aggregation](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.groupby.html), [nulls](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.groupby.html), and others. For example, we can identify the number of unique values in each column by using `nunique()`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "df.nunique()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We can also count how many unique values of each type for a column using `df.value_counts()`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "scrolled": true - }, - "outputs": [], - "source": [ - "df.value_counts(['color'])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "There are many many more methods and operations for Pandas DataFrames. Check out our Data Wrangling with Python workshop for more on DataFrames (and part 3 of this workshop1)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Other Data Structures\n", - "\n", - "There are many more data structures in Python that you may run across. A few include:\n", - "\n", - "* **tuple**: Similar to a list, but values can't be changed. 
Tuples are **immutable**.\n", - "* **set**: An unordered list, which can only contain **unique** values.\n", - "* **range**: A sequence of numbers, often in an arithmetic sequence. \n", - "* And many [more](https://docs.python.org/3/library/stdtypes.html#immutable-sequence-types)!\n", - "\n", - "\n", - "We often interact with these more often as the output of functions rather than writing them ourselves, but it's good to be aware of them. " - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.12" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/lessons/Part2/06_lists.ipynb b/lessons/Part2/06_lists.ipynb new file mode 100644 index 0000000..d0ccbb6 --- /dev/null +++ b/lessons/Part2/06_lists.ipynb @@ -0,0 +1,247 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Python Data Structures: Lists\n", + "**Learning Objectives**\n", + "* Introduce lists as an ordered set of data.\n", + "* Practice interacting with and manipulating lists.\n", + "\n", + "\n", + "Data structures are objects that organize and store data in a useful way. They're a bedrock of data analysis in programming. We'll cover one of the fundamental data structures in this lesson (lists) and two more (dictionaries and DataFrames) in the next lesson.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Lists: Ordered Data Structures\n", + "\n", + "Lists are a collection of **ordered** items. Lists have a length, and the items inside can be **indexed**, or accessed based on their positions.\n", + "\n", + "They're most useful when storing a collection of values when order is important. 
One nice thing about lists is that they can contain different types of data. For example, the entries of a list can be integers, floats, strings, and even other lists!\n", + "\n", + "We specify a list with square brackets: `[]` and commas separating each entry in the list." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "country_list = [\"Afghanistan\", \"Canada\", \"Thailand\", \"Denmark\", \"Japan\"]\n", + "type(country_list)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "`len()` gives the number of items in a list. What is the output of the line below?" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "len(country_list)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You can index a list using square brackets following the list name. A single position, `[i]`, selects one entry, while the notation `[start:stop]` selects a range of entries (a **slice**). **Note:** This uses square brackets rather than parentheses. Using a colon indicates that you want to access all entries between the two endpoints. If one side of the colon is empty, the slice extends to that end of the list. \n", + "\n", + "**Question:** Below are some examples of indexing on `country_list`. For each entry, predict what the output will be, then check your predictions.\n", + "\n", + "\n", + "**Hint:** Python is *zero*-indexed, meaning the first entry has index zero, not one! In addition, the `stop` index indicates 'up to but not including'. So, in `list[start:stop]`, `list[stop]` is not included.\n", + "\n", + "**Bonus:** What is the type of each of the outputs below?" 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print(country_list[0])\n", + "print(country_list[1:4])\n", + "print(country_list[1:])\n", + "print(country_list[:4])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Bonus:** You can also use negative numbers to count positions from the end of the list:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print(country_list[-1])\n", + "print(country_list[-4:-1])\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Challenge 1: Slicing Lists\n", + "\n", + "Using the list and variables in the next cell:\n", + "\n", + "1. What does `thing[start:stop]` do? What is the output?\n", + "2. Write three different ways to slice the list from `elephant` to the end." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "thing = [1, 3, 8, 'elephant', 'banana', 2]\n", + "start = 2\n", + "stop = 5" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### List Methods\n", + "\n", + "As we discussed in Part 1, objects can have methods associated with their data type that are accessed via dot notation (`object.method()`).\n", + "\n", + "Recall that methods are functions that operate specifically on a particular data type. Lists have their own methods which perform operations specific to lists. One of the most commonly used is the `append()` method, which adds an item to the end of a list. 
\n", + "\n", + "The code below adds a country to `country_list` using `append()`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print(country_list)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "country_list.append('USA')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print(country_list)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Note that the `append()` method operates **in-place**: it modifies the object it is applied to. This is not always the case in Python - some methods return an object that must be stored in its own variable.\n", + "\n", + "There are many other useful list methods. Use the [documentation](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists) to investigate available methods for dealing with lists." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Challenge 2: Adding to Lists\n", + "\n", + "We've created a list called `thing` in the cell below.\n", + "\n", + "1. Append the following values to the list, individually: `'apple'`, `8`, and `9`. Print the ensuing list out.\n", + "2. We've made another list called `thing2` consisting of the values `'apple'`, `8`, and `9`. Append `thing2` to `thing`. How does the output differ from the output from the previous part?\n", + "3. Look at the [documentation](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists) for the list method `.extend()`. Rewrite your answer to (2) to use extend. How does that compare to the outputs of the previous two parts?\n", + "4. What is one situation where you would use `append` and one where you would use `extend`?\n", + "\n", + "**Hint**: *Iterable* in Python means an object with multiple values that can be iterated through (including lists, tuples, and even strings)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "#1\n", + "thing = [1, 3, 8, 'elephant', 'banana', 2]\n", + "\n", + "#....append(...)\n", + "#2\n", + "thing = [1, 3, 8, 'elephant', 'banana', 2]\n", + "thing2 = ['apple',8,9]\n", + "#....append(...)\n", + "\n", + "\n", + "#3\n", + "thing = [1, 3, 8, 'elephant', 'banana', 2]\n", + "thing2 = ['apple',8,9]\n", + "#....extend(...)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Other Data Structures\n", + "\n", + "There are many more data structures in Python that you may run across. We will cover two more in the next section: dictionaries and DataFrames. Several others are not covered in this workshop. They include:\n", + "\n", + "* **tuple**: Similar to a list, but its values can't be changed once it is created. Tuples are **immutable**.\n", + "* **set**: An unordered collection, which can only contain **unique** values.\n", + "* **range**: An immutable sequence of evenly spaced integers. \n", + "* And many [more](https://docs.python.org/3/library/stdtypes.html#immutable-sequence-types)!\n", + "\n", + "\n", + "We usually encounter these as the output of functions rather than writing them ourselves, but it's good to be aware of them. 
" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.12" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/lessons/Part2/07_dictionaries_and_dataframes.ipynb b/lessons/Part2/07_dictionaries_and_dataframes.ipynb new file mode 100644 index 0000000..a7ef296 --- /dev/null +++ b/lessons/Part2/07_dictionaries_and_dataframes.ipynb @@ -0,0 +1,473 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Python Data Structures: Dictionaries and Data Frames\n", + "\n", + "**Learning Objectives**\n", + "* Introduce dictionaries, and data frames.\n", + "* Practice interacting with and manipulating these data structures.\n", + "\n", + "Dictionaries and DataFrames are two other key types of data structure. We will cover each of these in the sections below.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Dictionaries: Key-Value Structures\n", + "\n", + "Dictionaries are organized on the principle of key-value pairs. The **keys** can be used to access the **values**. They're most useful when you have unordered data organized in pairs. This occurs, for example, in storing metadata (data describing other data).\n", + "\n", + "Keys can be ints, floats, or strings, and are unordered. Values, however, can be any data type.\n", + "\n", + "Dictionaries are specified in Python using curly braces, with colons separating keys and values. 
\n", + "\n", + "Let's take a look at an example dictionary:" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "1935" + ] + }, + "execution_count": 1, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "example_dict = {\n", + " \"name\": \"Forough Farrokhzad\",\n", + " \"year of birth\": 1935,\n", + " \"year of death\": 1967,\n", + " \"place of birth\": \"Iran\",\n", + " \"language\": \"Persian\"}\n", + "\n", + "example_dict['year of birth']" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Like lists, dictionaries have their own methods. One of them is the `keys()` method. What is the type of the output of `.keys()`?" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "dict_keys(['name', 'year of birth', 'year of death', 'place of birth', 'language'])\n" + ] + }, + { + "data": { + "text/plain": [ + "dict_keys" + ] + }, + "execution_count": 2, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "print(example_dict.keys())\n", + "type(example_dict.keys())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "`dict_keys` is a type that we haven't encountered before, so it will be hard to work with this type directly. However, recall that we can use type conversion to change the type of a variable. 
We can **cast** (or change the type of) the dictionary keys to a list, which is a type that we are more familiar with:" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "['name', 'year of birth', 'year of death', 'place of birth', 'language']" + ] + }, + "execution_count": 3, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "list(example_dict.keys())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Challenge 1: Creating a Dictionary\n", + "\n", + "Create a dictionary `fruits` with the following lists. Use the name of each list as its key in the dictionary. Print the list of keys of the dictionary." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [], + "source": [ + "fruit = ['apple', 'orange', 'mango']\n", + "length = [3.2, 2.1, 3.1]\n", + "color = ['red', 'orange', 'yellow']\n", + "\n", + "\n", + "\n", + "# YOUR CODE HERE" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Dictionaries are useful for hierarchical storage of data (and can even be nested just like lists!). They are also often used to initialize data frames, a data structure for tabular data that is essential for data scientists." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Data Frames\n", + "\n", + "A common data structure you've likely already encountered is tabular data. Think of an Excel sheet: each column corresponds to a different feature of each datapoint, while rows correspond to different samples.\n", + "\n", + "In scientific programming, tabular data is often called a \"data frame\". In Python, there is a specialized library called `pandas` which provides a `DataFrame` object that implements this data structure.\n", + "\n", + "We're going to explore `pandas` more closely in Part 3, but let's try creating a `DataFrame` object right now. 
\n", + "\n", + "\n", + "First, we need to create a dictionary:\n", + "\n", + "**Note:** You can also substitute in your answer for Challenge 1 below." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [], + "source": [ + "fruit = ['apple', 'orange', 'mango', 'strawberry', 'salmonberry', 'thimbleberry']\n", + "size = [3, 2, 3, 1, 1, 1]\n", + "color = ['red', 'orange', 'orange', 'red', 'orange', 'red']\n", + "\n", + "fruits = {\n", + " 'fruit': fruit,\n", + " 'size': size,\n", + " 'color': color}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, we import the `pandas` **library** (We will cover libraries in more detail in Part 3) and pass in the dictionary to the `pd.DataFrame()` function, storing the result in a variable called `df`." + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
<div>\n", + "<table border=\"1\" class=\"dataframe\">\n", + " <thead>\n", + "
 <tr><th></th><th>fruit</th><th>size</th><th>color</th></tr>\n", + " </thead>\n", + " <tbody>\n", + "
 <tr><th>0</th><td>apple</td><td>3</td><td>red</td></tr>\n", + "
 <tr><th>1</th><td>orange</td><td>2</td><td>orange</td></tr>\n", + "
 <tr><th>2</th><td>mango</td><td>3</td><td>orange</td></tr>\n", + "
 <tr><th>3</th><td>strawberry</td><td>1</td><td>red</td></tr>\n", + "
 <tr><th>4</th><td>salmonberry</td><td>1</td><td>orange</td></tr>\n", + "
 <tr><th>5</th><td>thimbleberry</td><td>1</td><td>red</td></tr>\n", + "
 </tbody>\n", + "</table>\n", + "</div>
" + ], + "text/plain": [ + " fruit size color\n", + "0 apple 3 red\n", + "1 orange 2 orange\n", + "2 mango 3 orange\n", + "3 strawberry 1 red\n", + "4 salmonberry 1 orange\n", + "5 thimbleberry 1 red" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "import pandas as pd\n", + "\n", + "df = pd.DataFrame(fruits)\n", + "df" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The keys became column names and the values became cells in the `DataFrame`. In addition, there is an **index** on the left that keeps track of the row.\n", + "\n", + "Objects can also have **attributes**, or variables associated with the data type. We can get the number of columns and rows with `df.shape`, an attribute of the dataframe. \n", + "\n", + "**Question:** How many rows and columns does this dataframe have? " + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "(6, 3)" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "df.shape" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Challenge 2: Initializing a DataFrame\n", + "\n", + "The following code gives an error. Why does it have an error? What are some ways to fix this?" 
+ ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "ename": "ValueError", + "evalue": "All arrays must be of the same length", + "output_type": "error", + "traceback": [ + "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[1;31mValueError\u001b[0m Traceback (most recent call last)", + "Input \u001b[1;32mIn [10]\u001b[0m, in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[0;32m 3\u001b[0m color \u001b[38;5;241m=\u001b[39m [\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mred\u001b[39m\u001b[38;5;124m'\u001b[39m, \u001b[38;5;124m'\u001b[39m\u001b[38;5;124morange\u001b[39m\u001b[38;5;124m'\u001b[39m, \u001b[38;5;124m'\u001b[39m\u001b[38;5;124myellow\u001b[39m\u001b[38;5;124m'\u001b[39m]\n\u001b[0;32m 5\u001b[0m fruit_dict \u001b[38;5;241m=\u001b[39m {\n\u001b[0;32m 6\u001b[0m \u001b[38;5;124m'\u001b[39m\u001b[38;5;124mfruit\u001b[39m\u001b[38;5;124m'\u001b[39m: fruit,\n\u001b[0;32m 7\u001b[0m \u001b[38;5;124m'\u001b[39m\u001b[38;5;124mlength\u001b[39m\u001b[38;5;124m'\u001b[39m: length,\n\u001b[0;32m 8\u001b[0m \u001b[38;5;124m'\u001b[39m\u001b[38;5;124mcolor\u001b[39m\u001b[38;5;124m'\u001b[39m: color}\n\u001b[1;32m---> 10\u001b[0m df_fruit \u001b[38;5;241m=\u001b[39m \u001b[43mpd\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mDataFrame\u001b[49m\u001b[43m(\u001b[49m\u001b[43mfruit_dict\u001b[49m\u001b[43m)\u001b[49m\n", + "File \u001b[1;32m~\\anaconda32022\\lib\\site-packages\\pandas\\core\\frame.py:636\u001b[0m, in \u001b[0;36mDataFrame.__init__\u001b[1;34m(self, data, index, columns, dtype, copy)\u001b[0m\n\u001b[0;32m 630\u001b[0m mgr \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_init_mgr(\n\u001b[0;32m 631\u001b[0m data, axes\u001b[38;5;241m=\u001b[39m{\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mindex\u001b[39m\u001b[38;5;124m\"\u001b[39m: index, 
\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mcolumns\u001b[39m\u001b[38;5;124m\"\u001b[39m: columns}, dtype\u001b[38;5;241m=\u001b[39mdtype, copy\u001b[38;5;241m=\u001b[39mcopy\n\u001b[0;32m 632\u001b[0m )\n\u001b[0;32m 634\u001b[0m \u001b[38;5;28;01melif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(data, \u001b[38;5;28mdict\u001b[39m):\n\u001b[0;32m 635\u001b[0m \u001b[38;5;66;03m# GH#38939 de facto copy defaults to False only in non-dict cases\u001b[39;00m\n\u001b[1;32m--> 636\u001b[0m mgr \u001b[38;5;241m=\u001b[39m \u001b[43mdict_to_mgr\u001b[49m\u001b[43m(\u001b[49m\u001b[43mdata\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mindex\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcolumns\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mdtype\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdtype\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcopy\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mcopy\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mtyp\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmanager\u001b[49m\u001b[43m)\u001b[49m\n\u001b[0;32m 637\u001b[0m \u001b[38;5;28;01melif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(data, ma\u001b[38;5;241m.\u001b[39mMaskedArray):\n\u001b[0;32m 638\u001b[0m \u001b[38;5;28;01mimport\u001b[39;00m \u001b[38;5;21;01mnumpy\u001b[39;00m\u001b[38;5;21;01m.\u001b[39;00m\u001b[38;5;21;01mma\u001b[39;00m\u001b[38;5;21;01m.\u001b[39;00m\u001b[38;5;21;01mmrecords\u001b[39;00m \u001b[38;5;28;01mas\u001b[39;00m \u001b[38;5;21;01mmrecords\u001b[39;00m\n", + "File \u001b[1;32m~\\anaconda32022\\lib\\site-packages\\pandas\\core\\internals\\construction.py:502\u001b[0m, in \u001b[0;36mdict_to_mgr\u001b[1;34m(data, index, columns, dtype, typ, copy)\u001b[0m\n\u001b[0;32m 494\u001b[0m arrays \u001b[38;5;241m=\u001b[39m [\n\u001b[0;32m 495\u001b[0m x\n\u001b[0;32m 496\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m 
\u001b[38;5;28mhasattr\u001b[39m(x, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mdtype\u001b[39m\u001b[38;5;124m\"\u001b[39m) \u001b[38;5;129;01mor\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(x\u001b[38;5;241m.\u001b[39mdtype, ExtensionDtype)\n\u001b[0;32m 497\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m x\u001b[38;5;241m.\u001b[39mcopy()\n\u001b[0;32m 498\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m x \u001b[38;5;129;01min\u001b[39;00m arrays\n\u001b[0;32m 499\u001b[0m ]\n\u001b[0;32m 500\u001b[0m \u001b[38;5;66;03m# TODO: can we get rid of the dt64tz special case above?\u001b[39;00m\n\u001b[1;32m--> 502\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43marrays_to_mgr\u001b[49m\u001b[43m(\u001b[49m\u001b[43marrays\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcolumns\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mindex\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mdtype\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdtype\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mtyp\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mtyp\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mconsolidate\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mcopy\u001b[49m\u001b[43m)\u001b[49m\n", + "File \u001b[1;32m~\\anaconda32022\\lib\\site-packages\\pandas\\core\\internals\\construction.py:120\u001b[0m, in \u001b[0;36marrays_to_mgr\u001b[1;34m(arrays, columns, index, dtype, verify_integrity, typ, consolidate)\u001b[0m\n\u001b[0;32m 117\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m verify_integrity:\n\u001b[0;32m 118\u001b[0m \u001b[38;5;66;03m# figure out the index, if necessary\u001b[39;00m\n\u001b[0;32m 119\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m index \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[1;32m--> 120\u001b[0m index \u001b[38;5;241m=\u001b[39m 
\u001b[43m_extract_index\u001b[49m\u001b[43m(\u001b[49m\u001b[43marrays\u001b[49m\u001b[43m)\u001b[49m\n\u001b[0;32m 121\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[0;32m 122\u001b[0m index \u001b[38;5;241m=\u001b[39m ensure_index(index)\n", + "File \u001b[1;32m~\\anaconda32022\\lib\\site-packages\\pandas\\core\\internals\\construction.py:674\u001b[0m, in \u001b[0;36m_extract_index\u001b[1;34m(data)\u001b[0m\n\u001b[0;32m 672\u001b[0m lengths \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mlist\u001b[39m(\u001b[38;5;28mset\u001b[39m(raw_lengths))\n\u001b[0;32m 673\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mlen\u001b[39m(lengths) \u001b[38;5;241m>\u001b[39m \u001b[38;5;241m1\u001b[39m:\n\u001b[1;32m--> 674\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mValueError\u001b[39;00m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mAll arrays must be of the same length\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[0;32m 676\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m have_dicts:\n\u001b[0;32m 677\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mValueError\u001b[39;00m(\n\u001b[0;32m 678\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mMixing dicts with non-Series may lead to ambiguous ordering.\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[0;32m 679\u001b[0m )\n", + "\u001b[1;31mValueError\u001b[0m: All arrays must be of the same length" + ] + } + ], + "source": [ + "fruit = ['apple', 'orange']\n", + "length = [3.2, 2.1, 3.1]\n", + "color = ['red', 'orange', 'yellow']\n", + "\n", + "fruit_dict = {\n", + " 'fruit': fruit,\n", + " 'length': length,\n", + " 'color': color}\n", + "\n", + "df_fruit = pd.DataFrame(fruit_dict)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Working with DataFrames\n", + "\n", + "Pandas has hundreds of useful ways for us to work with DataFrames. 
We will cover a couple of general topics here and in Part 3, but for more on pandas, consider the Python Data Wrangling workshop. \n", + "\n", + "\n", + "We can select a single column by passing its name in brackets. The result is what `pandas` calls a `pd.Series` object. The act of obtaining a particular subset of a data frame is often referred to as **slicing**. This uses bracket notation to select part of the data.\n", + "\n", + "Check the type of the slice below:" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": { + "scrolled": false + }, + "outputs": [ + { + "data": { + "text/plain": [ + "0 apple\n", + "1 orange\n", + "2 mango\n", + "3 strawberry\n", + "4 salmonberry\n", + "5 thimbleberry\n", + "Name: fruit, dtype: object" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Bracket notation to choose a column\n", + "df['fruit']" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "`DataFrame` objects also have methods, including those for [merging](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.merge.html?highlight=merge#pandas.DataFrame.merge), [aggregation](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.groupby.html), [nulls](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.isna.html), and others. Many of these methods can also be applied to a single column of the DataFrame. 
For example, we can identify the number of unique values in each column by using `.nunique()`, and what those unique values are by using `.unique()`:" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "2\n", + "['red' 'orange']\n" + ] + } + ], + "source": [ + "#number of unique colors in the df\n", + "print(df['color'].nunique())\n", + "\n", + "\n", + "#unique colors in the df\n", + "print(df['color'].unique())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Challenge 3: `value_counts()`\n", + "\n", + "There is another pandas function `.value_counts()` which can be used to help organize the information provided by `unique()`. Read the [documentation](https://pandas.pydata.org/docs/reference/api/pandas.Series.value_counts.html) and apply `value_counts()` to the `df` variable. How many 'red' and 'orange' fruits are in the DataFrame?" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "## YOUR CODE HERE." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.12" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/lessons/Part2/07_loops.ipynb b/lessons/Part2/08_loops.ipynb similarity index 80% rename from lessons/Part2/07_loops.ipynb rename to lessons/Part2/08_loops.ipynb index 9ab1249..30db703 100644 --- a/lessons/Part2/07_loops.ipynb +++ b/lessons/Part2/08_loops.ipynb @@ -59,14 +59,14 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Which method do you prefer?" + "**Question:** Which method do you prefer?" 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Let's say that we have 1000 tire pressures in a list. Are we going to have to round each one by hand?\n", + "Let's say that we have 1000 tire pressures in a list. Using this method you would have to round each value separately.\n", "\n", "Our current approach is not particularly scalable. It's also not very flexible. For example, what if you want to round every tire pressure to two decimal places?" ] @@ -75,13 +75,13 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Loops Facilitate Repeated Computation\n", + "## Loops for repeated computation\n", "\n", "The strength of using computers is their speed. We can leverage this by facilitating repeated computation with **loops**. In programming, there are generally two kinds of loops: for loops and while loops. \n", "\n", "A **for loop** tells Python to execute some statements once *for* each value in a list, a character string, or some other set of values. Specifically, we structure our computation as: \"**for** each thing in this group, **do** these operations\".\n", "\n", - "In the above example, we would think of the for loop as: \"for each tire pressure, round it to $N$ decimal places\".\n", + "**Question:** How would we formulate this statement for the tire pressure problem above?\n", "\n", "The other major type of loop is a [while](https://www.geeksforgeeks.org/python-while-loop/) loop. We don't use these loops frequently in this type of programming, but we may encounter them. A while loop means 'while Condition A is true, repeat Computation'.\n", "\n", @@ -94,22 +94,9 @@ "metadata": {}, "outputs": [], "source": [ - "# We use the \"for\" and \"in\" keywords\n", - "# The block of the for loop is indicated by a colon\n", - "for pressure in [40.9, 35.2, 28.4]:\n", - " # Indentation is very important! 
It's how Python knows that this is inside the for loop\n", - " print(round(pressure))\n", + "# We use a variable containing a list with the values to be iterated through\n", + "tires = [41,35,28]\n", "\n", - "print('the loop has ended')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# We can also substitute a variable containing a list with the values to be iterated through\n", "for pressure in tires:\n", " print(round(pressure))\n", "\n", @@ -143,20 +130,6 @@ " * Placeholders for the loop." ] }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "tires = [40.9,35.2,28.4]\n", - "\n", - "for pressure in tires:\n", - " p=round(pressure)\n", - " print('original:',pressure,'rounded:',p)\n", - "print('the loop has ended')\n" - ] - }, { "cell_type": "markdown", "metadata": {}, @@ -182,7 +155,7 @@ "source": [ "## Loops with Strings, Series, and `range`\n", "\n", - "Loops can loop over any iterable data type. An **iterable** is any data type that can be iterated over, like a sequence. A rule of thumb is that anything that can be indexed (e.g. accessed with `values[i]`) is an iterable.\n", + "Loops can loop over any iterable data type. An **iterable** is any data type that can be iterated over, like a sequence. Generally anything that can be indexed (e.g. accessed with `values[i]`) is an iterable.\n", "\n", "For example, a string is iterable, so it is possible to loop through a string!\n", "\n", @@ -201,61 +174,6 @@ " print(char.upper())" ] }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "There is also a built-in function called `range` which produces a sequence of numbers. This is *not* a list: the numbers are produced on demand to make looping over large ranges more efficient. \n", - "\n", - "A few use cases:\n", - "\n", - "* `range(N)` produces the integers from $0$ to $N-1$ (remember the zero indexing!). 
This is used frequently with range(len(list)) to iterate through each index in the list.\n", - "* `range(a, b)` produces the integers from $a$ to $b-1$.\n", - "* `range(a, b, x)` produces the integers from $a$ to $b-1$, with a spacing of $x$ between the outputs.\n", - "\n", - "The `range` function is one of the most common ways of iterating through a sequence, since it's so easy to generate values." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "evens = [0,2,4,6,8,10,12]\n", - "for idx in range(len(evens)):\n", - " print('index:',idx,'value:',evens[idx])\n", - " \n", - "print('The loop has ended.')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "evens = [0,2,4,6,8,10,12]\n", - "for idx in range(1,5):\n", - " print('index:',idx,'value:',evens[idx])\n", - " \n", - "print('The loop has ended.')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "evens = [0,2,4,6,8,10,12]\n", - "\n", - "for idx in range(1, 6, 2):\n", - " print('index:',idx,'value:',evens[idx])\n", - " \n", - "print('The loop has ended.')" - ] - }, { "cell_type": "markdown", "metadata": {}, @@ -271,9 +189,7 @@ "1. Extract the column `elevation` as a Series.\n", "2. Loop through the series.\n", "3. Convert each value to meters.\n", - "4. Print the result. \n", - "\n", - "**Bonus**: Use `range` to iterate through the series and achieve the same result. Can you think of any other ways to loop through the DataFrame?" + "4. Print the result. 
\n" ] }, { @@ -337,7 +253,7 @@ " \n", "The result of this is a single list, number, or string with a summary value for the entire collection being looped over.\n", "\n", - "For example, we can make a new list with all of the tire pressures rounded:" + "Returning to the tire pressure example, we can make a new list with all of the tire pressures rounded:" ] }, { @@ -567,7 +483,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.8.12" + "version": "3.9.12" } }, "nbformat": 4, diff --git a/lessons/Part2/08_conditionals.ipynb b/lessons/Part2/09_conditionals.ipynb similarity index 73% rename from lessons/Part2/08_conditionals.ipynb rename to lessons/Part2/09_conditionals.ipynb index 633843c..3a063b2 100644 --- a/lessons/Part2/08_conditionals.ipynb +++ b/lessons/Part2/09_conditionals.ipynb @@ -9,7 +9,7 @@ "\n", "**Learning Objectives**:\n", "\n", - "- Introduce the Boolean type.\n", + "- Introduce the **Boolean** type.\n", "- Define and understand conditional statements.\n", "- Understand the use of `if`, `else`, and `elif`.\n", "- Use a conditional inside of a loop.\n", @@ -22,7 +22,7 @@ "source": [ "## The Boolean Data Type\n", "\n", - "Booleans are a fundamental data type in programming. Booleans are binary, taking on values: `True` or `False`.\n", + "**Booleans** are a fundamental data type in programming. Boolean values are **binary**: they can be either `True` or `False`.\n", "\n", "Why do we use these? They're very useful for **control flow**: changing the course of a program depending on certain conditions. For example, we might make different decisions on what computation to perform based on the current state of the data, user preferences, etc. Booleans allow decision making in these contexts.\n", "\n", @@ -43,7 +43,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Booleans are commonly seen in the results of inequalities. 
Predict the outcome of the inequalities printed below, then run the code.\n", + "\n", + "**Note:** Equality is signaled in Python (and many other languages) by the double equals sign `==`. This is distinct from the **assignment operator** (single equals sign `=`) used in variable assignment (e.g. `year = 1996`)." ] }, { @@ -58,12 +60,6 @@ "# Less than\n", "print(\"Is 3 < 5?\", 3 < 5)\n", "\n", - "# Greater than or equal to\n", - "print(\"Is 3 <= 3?\", 3 <= 3)\n", - "\n", - "# Less than or equal to \n", - "print(\"Is 3 >= 3.1?\", 3 >= 3.1)\n", - "\n", "# Exactly equal to\n", "print(\"Is ice == ice?\", 'ice' == 'ice')" ] @@ -81,7 +77,6 @@ "metadata": {}, "outputs": [], "source": [ - "print(\"Is ice != ice?\", 'ice' != 'ice')\n", "print(\"Is ice != water?\", 'ice' != 'water')" ] }, @@ -89,7 +84,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Furthermore, you can *negate* a Boolean expression with the keyword `not`:" + "Furthermore, you can *negate* a Boolean expression with the keyword `not`. Predict the outcomes for the examples below:\n", + "\n", + "**Hint:** Recall `yes=True` and `no=False`." ] }, { @@ -99,9 +96,7 @@ "outputs": [], "source": [ "print(yes)\n", - "print(not yes)\n", - "print(no)\n", - "print(not no)" + "print(not yes)" ] }, { @@ -129,9 +124,27 @@ "\n", "print(a and b)\n", "\n", - "print(a or b)\n", + "print(a or b)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Notice that when you are combining Boolean expressions, parentheses are used to indicate order of evaluation. The innermost parentheses are evaluated first, then the later ones. For example, compare the two lines below. \n", + "\n", + "**Question:** Why are the outputs different? 
Write the order of evaluation for each line below.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print(not (a and b))\n", "\n", - "print(not (a and b))" + "print(not a and b)" ] }, { @@ -140,9 +153,9 @@ "source": [ "## Challenge 1: Boolean Errors\n", "\n", - "The following cell gives error(s). Identify each error and how to fix it. \n", + "1) The following cell gives error(s). Identify each error and how to fix it. \n", "\n", - "What is the output of the cell?" + "2) What is the output of the cell?" ] }, { @@ -166,7 +179,7 @@ "\n", "A **boolean mask** takes a conditional statement and generates a series with `True` where the condition is met, and `False` where the condition is not met. This is useful for working with tabular data like data frames.\n", "\n", - "Let's say we're working with the mountains data frame and we want to know which mountains have an elevation over 14200 feet. We can use a Boolean mask for this." + "It's often helpful to look at an example. Let's say we're working with a mountains `DataFrame` and we want to know which mountains have an elevation over 14200 feet. We can use a Boolean mask for this." ] }, { @@ -197,7 +210,9 @@ { "cell_type": "code", "execution_count": null, - "metadata": {}, + "metadata": { + "scrolled": true + }, "outputs": [], "source": [ "# Select the elevation column and apply a boolean mask\n", @@ -208,7 +223,24 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Run the cell below. What does the following code do? What does the result represent? " + "Let's add this as a column to `mountains_df`. We can add a column by assigning a series to a new column name in bracket notation. 
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "mountains_df['over_14200'] = mountains_df['elevation'] > 14200\n", + "mountains_df" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Question:** What does the following code do? What does the result represent? " ] }, { @@ -224,9 +256,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Using Boolean mask and sum is a quick trick that is useful for summarizing column data. There is implicit type conversion here: the Booleans are cast to integers.\n", + "Using Boolean mask and sum is a quick trick that is useful for summarizing column data. There is implicit type conversion here. What is it?\n", "\n", - "If we want to see the proportion of the data that matches the condition, we can take this one step farther. " + "If we want to see the proportion of the data that satisfies the condition, we can take this one step further. " ] }, { @@ -242,16 +274,16 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Conditionals: If Statements\n", + "## Conditionals: If-Statements\n", "\n", "A fundamental structure in programming is the **conditional**. These blocks allow different blocks of code to run, *conditional* on specific things being true.\n", "\n", - "The most widely used conditional is the **if statement**. An if statement controls whether some block of code is executed or not. Its structure is similar to that of a for loop: \n", + "The most widely used conditional is the **if-statement**. An if-statement controls whether some block of code is executed or not. Its structure is similar to that of a for loop: \n", "\n", - "* The first line opens with the `if` keyword and contains a Boolean variable or expression. 
It ends with a colon.\n", - "* The body, containing whatever code that runs if the Boolean expression is true, is indented.\n", + "* The first line opens with the `if` keyword and contains a Boolean variable or expression. It ends with a colon. If the expression evaluates to `True`, the block of code will run.\n", + "* The body, containing whatever code to execute if the condition is met, is indented.\n", "\n", - "So, if the Boolean expression is `True`, the body of an if statement is run. If not, it's skipped. Let's look at an example:" + "So, if the Boolean expression is `True`, the body of an if-statement is run. If not, it's skipped. Let's look at an example:" ] }, { @@ -277,7 +309,9 @@ "source": [ "## Conditionals and Loops\n", "\n", - "Conditionals are particularly useful when we're iterating through a list, and want to perform some operation only on specific components of that list that satisfy a condition we set." + "Conditionals are particularly useful when we're iterating through a list, and want to perform some operation only on specific components of that list that satisfy a certain condition.\n", + "\n", + "**Question:** what will the output of the following code be?" ] }, { @@ -297,9 +331,11 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Conditionals: Else Statements\n", + "## Conditionals: Else-statements\n", "\n", - "Else statements supplement if statements. They allow us to specify an alternative block of code to run if the if statement's conditional evaluates to `False`." + "Else-statements supplement if-statements. They allow us to specify an alternative block of code to run if the if-statement's conditional evaluates to `False`.\n", + "\n", + "**Question:** What is the difference between the following cell and the previous if statement. How will that affect the output?" ] }, { @@ -323,13 +359,13 @@ "source": [ "## Conditionals: Else-if Statements\n", "\n", - "We may want to check several conditionals at the same time. 
Else-if (Elif) statements allow us to specify as many conditional checks as we'd like in the same block.\n", + "We may want to check several conditionals at the same time. Else-if (Elif-) statements allow us to specify as many conditional checks as we'd like in the same block.\n", "\n", - "Else-if statements must follow an if statement. They only are checked if the if statement fails. Then, each else-if statement is checked, with their corresponding bodies run when the conditional evaluates to `True`.\n", + "Elif-statements must follow an if-statement. They only are checked if the if-statement fails. Then, each elif-statement is checked, with their corresponding bodies run when the conditional evaluates to `True`.\n", "\n", "An else statement at the end can act as a \"catch all\", when the if statement and all following else-if statements fail.\n", "\n", - "In Python, else if statements are indicated by the `elif` keyword:" + "In Python, else if statements are indicated by the `elif` keyword. Consider the following conditional cell." ] }, { @@ -373,7 +409,7 @@ "metadata": {}, "outputs": [], "source": [ - "scores = [80, 85, 99, 75, 70, 68]\n", + "scores = [85, 99, 77,68]\n", "\n", "for score in scores:\n", " if score >= 80:\n", @@ -390,16 +426,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The order of the if and elif statements matters. When one `if`/`elif` statement is met, all following statements are skipped. If there are multiple `if` statements, then each statement is evaluated separately. These kinds of errors won't give errors in the code, but they will give results that might not make sense, which can take longer to find and debug." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Challenge 3: If statements and aggregation\n", - "\n", - "From the list below, let's create a new list that has 0 where the value was negative, and the number at indices where the value is positve." 
+ "The order of the if and elif statements matters. When one if/elif condition is met, all following statements are skipped. If there are multiple if statements, then each statement is evaluated separately. Mistakes of this kind won't raise errors in the code, but they will give results that might not make sense, which can take longer to find and debug." ] }, { @@ -423,9 +450,8 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Challenge 4: String Conditionals\n", - "\n", - "Below, we've created a list of US Presidents. Create a a new list containing all Presidents whose last name starts with the letter B.\n", + "## Challenge 3: Conditionals and Aggregation\n", + "Below, we've created a list of US Presidents. Create a new list containing all Presidents whose first name starts with the letter J. How many presidents are on this list?\n", "\n", "**Hint:** The `.split()` string function will be useful for this. Also, remember that strings are indexed: you can access any character of the string using bracket notation!" 
] @@ -491,8 +517,19 @@ "metadata": {}, "outputs": [], "source": [ - "## Your code here" + "first_name_j = ___\n", + "for p in presidents:\n", + " if ___:\n", + " ____.append(___)\n", + "print(first_name_j)" ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] } ], "metadata": { @@ -511,7 +548,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.8.12" + "version": "3.9.12" } }, "nbformat": 4, diff --git a/lessons/Part2/09_custom_functions.ipynb b/lessons/Part2/10_custom_functions.ipynb similarity index 81% rename from lessons/Part2/09_custom_functions.ipynb rename to lessons/Part2/10_custom_functions.ipynb index 069c29f..843ab60 100644 --- a/lessons/Part2/09_custom_functions.ipynb +++ b/lessons/Part2/10_custom_functions.ipynb @@ -18,9 +18,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "We have already used *built-in* functions like `len()` , `sum()`, `pd.DataFrame()`, in our code. These are essentially shortcuts that make it so that we don't need to write many lines of code to accomplish certain tasks. \n", + "We have already used **built-in functions** like `len()`, `sum()`, and `pd.DataFrame()` in our code. These are essentially shortcuts that make it so that we don't need to write many lines of code to accomplish certain tasks. \n", "\n", - "We didn't technically need these functions to perform their computation. For example, we can calculate the sum without relying on the function by using a for loop with aggregation:" + "We didn't technically need these functions to perform these tasks. For example, we can calculate the sum without relying on the function by using a for loop with aggregation:" ] }, { @@ -87,9 +87,9 @@ "\n", "Specifically, a function does three things:\n", "\n", - "1. They name pieces of code the way variables name strings and numbers.\n", - "2. They accept arguments, or inputs on which you'll operate. 
Arguments are also called parameters.\n", - "3. They return values that can be referred to in further operations.\n", + "1. It names pieces of code the way variables name strings and numbers.\n", + "2. It accepts arguments, or inputs on which you'll operate. Arguments are also called parameters.\n", + "3. It returns values that can be referred to in further operations.\n", "\n", "The details are pretty simple, but this is one of those ideas where it's good to get lots of practice!" ] @@ -104,14 +104,16 @@ "\n", "* Functions begin with the keyword `def`.\n", "* This keyword is followed by the function *name*.\n", - " * The name must obey the same rules as variable names.\n", - "* The *arguments* or *parameters* are defined in parentheses as variable names.\n", + " * The name must obey the same rules as variable names. (It is a variable!)\n", + "* The **arguments** or **parameters** are defined in parentheses as variable names.\n", " * Use empty parentheses if the function doesn't take any inputs.\n", "* A colon indicates the end of the function *signature*.\n", "* An indented block of code denotes the start of the *body*.\n", "* The final line should be a `return` statement with the value(s) to be returned from the function\n", "\n", - "**Note:** Arguments and variables created within the function only exist within the function and cannot be referred to unless returned by the function using the `return` statement." + "**Note:** Arguments and variables created within the function only exist within the function and cannot be referred to unless returned by the function using the `return` statement.\n", + "\n", + "Let's take a look at a simple function:" ] }, { @@ -128,7 +130,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Notice how there is no print statement from running the block of code above. This is because defining a function does not run it. You can think of it as assigning a value to a variable. 
The function needs to be *called* with appropriate arguments to execute the code it contains. \n", + "Notice how there is no output from running the block of code above. This is because defining a function does not run it. You can think of it as assigning a value to a variable. The function needs to be **called**, or run, with appropriate arguments to execute the code it contains. \n", "\n", "Let's run this function. We can save the output to a variable and print the result." ] @@ -175,7 +177,9 @@ "\n", "These arguments become variables when the function is executed. The variables are assigned the values passed to the function. We do operations based on the arguments, and return the result.\n", "\n", - "Let's look at an example function in which we're performing division:" + "Let's look at an example function in which we're performing division.\n", + "\n", + "**Question:** What is being divided by what in the following lines of code?" ] }, { @@ -188,7 +192,7 @@ " return(x / y)\n", "\n", "print(divide(4, 6))\n", - "print(divide(6, 4))" + "print(divide(6, 4)) " ] }, { @@ -197,7 +201,7 @@ "source": [ "The order of the arguments matter; we got different results because each argument had a different role (numerator and denominator).\n", "\n", - "You can also pass in **keyword arguments**, where each argument is given a name. In this case, the order of the arguments doesn't matter, since each has a name associated with it. For example:" + "You can also pass in **keyword arguments**, where each argument is assigned using a name." ] }, { @@ -223,9 +227,11 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Challenge 2: Calling Keyword Arguments\n", + "## Default Arguments\n", "\n", - "Call the following function using the keyword arguments. Specifically, print the date corresponding to January 1, 2003." + "We can also specify **default arguments** in functions. 
When we provide a default argument, the function will use that value when the user does not pass in a value. Default arguments are specified in the function signature.\n", + "\n", + "An expanded version of the `divide()` function is provided below. What is the additional parameter doing? What will be the output of `divide(24,5)`?" ] }, { @@ -234,29 +240,32 @@ "metadata": {}, "outputs": [], "source": [ - "def print_date(year, month, day):\n", - " joined = str(year) + '/' + str(month) + '/' + str(day)\n", - " print(joined)" + "# z has default value equal to True\n", + "def divide(x, y, z=True):\n", + " if z:\n", + " return(round(x / y))\n", + " else:\n", + " return(x/y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "## Default Arguments\n", + "We can use default arguments when there are arguments that we will only want to change some of the time. It's good practice to make the default of the argument the item that you will want to use most often.\n", "\n", - "We can also specify **default arguments** in functions. When we provide a default argument, the function will use that value when the user does not pass in a value. Default arguments are specified in the function signature." + "**Question:** What do you think the best default for the `z` argument above would be? What might be a better name for that argument?" ] }, { - "cell_type": "code", - "execution_count": null, + "cell_type": "markdown", "metadata": {}, - "outputs": [], "source": [ - "# y has default value equal to 10\n", - "def divide(x, y=10):\n", - " return(x / y)" + "## Challenge 2: More Errors!\n", + "\n", + "Why do the following lines return errors?\n", + "\n", + "**Hint**: Think about what happens inside the function, and how the arguments plug into the function." 
] }, { @@ -265,27 +274,7 @@ "metadata": {}, "outputs": [], "source": [ - "# User inputs on both values\n", - "print(divide(x=4, y=10))\n", - "# No input on y\n", - "print(divide(x=4))\n", - "# Unnamed inputs for both values\n", - "print(divide(4, 10))\n", - "# Unnamed input, with no second value passed in\n", - "print(divide(4))\n", - "# Unnamed first input, named second input\n", - "print(divide(4, y=10))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Challenge 3: More Errors!\n", - "\n", - "Why do the following lines return errors?\n", - "\n", - "**Hint**: Think about what happens inside the function, and how the arguments plug into the function." + "divide(z=False,10, 4)" ] }, { @@ -294,7 +283,7 @@ "metadata": {}, "outputs": [], "source": [ - "divide(y=10, 4)" + "divide(4, y='10')" ] }, { @@ -303,14 +292,14 @@ "metadata": {}, "outputs": [], "source": [ - "divide(4, y='10')" + "divide(4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "There's a lot of different permutations of arguments in functions, so keeping them organized " + "There's a lot of different permutations of arguments in functions, so keeping them organized will be helpful to both yourself and other people interacting with your code." ] }, { @@ -319,9 +308,7 @@ "source": [ "## Principles of Writing Your Own Functions\n", "\n", - "Function writing is one of the most important skills you can develop as a programmer. \n", - "\n", - "Here are some guidelines that can help minimize errors and make the process less painful:\n", + "Function writing is one of the most important skills you can develop as a programmer. However, there is also a lot that can go wrong in the function writing process, leading to time-consuming corrections. Here are some guidelines that can help minimize errors and make the process less painful:\n", "\n", "1. **Plan**\n", " 1. What is the overall goal of the function? Is there a function that exists already that does the same thing? 
\n", @@ -329,8 +316,8 @@ " 3. What arguments will you need? What pieces of the function do you need to control?\n", " 4. What are the general steps of the program? This can be written in bullet points or \"pseudocode\".\n", "2. **Write**\n", - " 1. Write the code without the function wrapper.\n", - " 2. Start small. Write small self-contained blocks of code and put the pieces together. You can also consider sub-functions.\n", + " 1. Start by writing the code without the function wrapper.\n", + " 2. Start small. Write small self-contained blocks of code and put the pieces together. You can also consider sub-functions if it is a particularly complex issue.\n", " 3. Test each part of the function as it is added. Track the input of the function and how it changes at each step. \n", " 4. Wrap the code in the function syntax.\n", "3. **Test**\n", @@ -351,7 +338,7 @@ "1. **Plan**\n", " 1. Parse a list of strings into two parts.\n", " 2. Input: list of strings\n", - " 3. Output: two lists, one of strings, one of ints\n", + " 3. Output: a DataFrame with two columns\n", " 4. 
The pseudocode might look like this: \n", " ``` \n", " function\n", @@ -380,6 +367,7 @@ " 'Alameda_2021.csv',\n", " 'San Francisco_2021.csv']\n", "\n", + "#choose 1 file\n", "test_file = files[1]" ] }, @@ -389,6 +377,7 @@ "metadata": {}, "outputs": [], "source": [ + "#split file into parts\n", "file_parts = test_file.split('.')[0].split('_')\n", "print(file_parts)" ] @@ -407,6 +396,9 @@ "outputs": [], "source": [ "file_parts = test_file.split('.')[0].split('_')\n", + "\n", + "\n", + "#separate out each piece of information that we need\n", "county = file_parts[0]\n", "year = file_parts[1]\n", "# Lowercase county\n", @@ -415,7 +407,7 @@ "year = int(year)\n", "# Check output\n", "print(county)\n", - "print(type(year))" + "print(year)" ] }, { @@ -441,6 +433,9 @@ "\n", "# Iterate over files\n", "for file in files:\n", + " \n", + " \n", + " ###This is all the same from before\n", " file_parts = test_file.split('.')[0].split('_')\n", " county = file_parts[0]\n", " year = file_parts[1]\n", @@ -448,7 +443,9 @@ " county = county.lower()\n", " # Convert year to int\n", " year = int(year)\n", - " # Store outputs\n", + " \n", + " \n", + " ### Store outputs in an aggregation variable\n", " county_list.append(county)\n", " year_list.append(year)\n", "# Check outputs\n", @@ -460,7 +457,10 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "What happened? How do we fix it? When we run the code on the whole loop, do you notice anything about the other county names? What might we want to change?\n", + "What happened? How do we fix it?\n", + "\n", + "\n", + "When we run the code on the whole loop, do you notice anything about the other county names? What might we want to change?\n", "\n", "Once the full code works, we can do the final steps: convert the output to a DataFrame and place everything into a function." 
] @@ -473,7 +473,10 @@ "source": [ "import pandas as pd\n", "\n", - "def parse_files(filelist):\n", + "def parse_files(filelist): #function header\n", + " \n", + " \n", + " ##all of this is the same\n", " county_list = []\n", " year_list = []\n", " for file in files:\n", @@ -487,7 +490,8 @@ " # Store outputs\n", " county_list.append(county)\n", " year_list.append(year)\n", - "\n", + " \n", + " ##add a DataFrame and return statement\n", " df = pd.DataFrame({'county': county_list,\n", " 'year': year_list})\n", " return df" @@ -499,6 +503,8 @@ "metadata": {}, "outputs": [], "source": [ + "##Test it on all files\n", + "\n", "files = ['Alameda_2020.csv',\n", " 'Marin_2020.csv',\n", " 'Contra Costa_2020.csv',\n", @@ -514,7 +520,7 @@ "source": [ "## Challenge 4: Advanced Conversion Function\n", "\n", - "Let's take our conversion function from before make it more flexible.\n", + "Now you will get a chance to practice developing a function. Let's take our feet-to-meters conversion function from before and make it more flexible.\n", "\n", "Let's say we want to convert from feet to other units as well. Change the original conversion function to a more generalized version `convert_from_feet(x,unit='meters')` by adding a keyword argument `unit` that defaults to meters. Use if statements within the body of the function to do the appropriate conversion based on this keyword argument and return the value. 
Choose two additional units to meters (such as inches, miles, or centimeters), and add them to the conversion function.\n", "\n", @@ -565,7 +571,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.8.12" + "version": "3.9.12" } }, "nbformat": 4, diff --git a/lessons/Part3/12_pandas.ipynb b/lessons/Part3/12_pandas.ipynb index 00f0fc6..7ece735 100644 --- a/lessons/Part3/12_pandas.ipynb +++ b/lessons/Part3/12_pandas.ipynb @@ -411,7 +411,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.8.12" + "version": "3.9.12" } }, "nbformat": 4, diff --git a/solutions/06_data_structures.ipynb b/solutions/06_data_structures.ipynb deleted file mode 100644 index 17c9a94..0000000 --- a/solutions/06_data_structures.ipynb +++ /dev/null @@ -1,250 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Challenge 1: Slicing Lists\n", - "Using the lists in the next cell:\n", - "\n", - "1. What does `thing[start:stop]` do? What is the output?\n", - "2. Write three different ways to slice the string from 'elephant' to the end." - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "thing = [1, 3, 8, 'elephant', 'banana', 2]\n", - "start = 2\n", - "stop = 5" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "[8, 'elephant', 'banana']" - ] - }, - "execution_count": 2, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "#1. \n", - "thing[2:5] " - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "['elephant', 'banana', 2]\n", - "['elephant', 'banana', 2]\n", - "['elephant', 'banana', 2]\n" - ] - } - ], - "source": [ - "#2. 
\n", - "print(thing[-3:])\n", - "print(thing[3:])\n", - "print(thing[3:6])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Challenge 2: Appending to Lists\n", - "\n", - "We've created a list called `thing` in the cell below.\n", - "\n", - "1. Append the following values to the list, individually: `'apple'`, `8`, and `9`. Print the ensuing list out.\n", - "2. Make a new list called `thing2` consisting of the values `'apple'`, `8`, and `9`. Append `thing2` to `thing`. How does the output differ from the output from the previous part?\n", - "3. Look at the [documentation](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists) for the list method `.extend()`. Is there a way to rewrite your answer to (2) to use extend? How does that compare to the outputs of the previous two parts?\n", - "4. What is one situation where you would use `append` and one where you would use `extend`?\n", - "\n", - "**Hint**: *Iterable* in Python means an object with multiple values that can be iterated through (including lists, tuples, and even strings)." 
- ] - }, - { - "cell_type": "code", - "execution_count": 21, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[1, 3, 8, 'elephant', 'banana', 2, 'apple', 8, 9]\n", - "[1, 3, 8, 'elephant', 'banana', 2, ['apple', 8, 9]]\n", - "[1, 3, 8, 'elephant', 'banana', 2, 'apple', 8, 9]\n" - ] - } - ], - "source": [ - "thing = [1, 3, 8, 'elephant', 'banana', 2]\n", - "\n", - "# 1\n", - "thing.append('apple')\n", - "thing.append(8)\n", - "thing.append(9)\n", - "print(thing)\n", - "\n", - "# 2\n", - "thing = [1, 3, 8, 'elephant', 'banana', 2]\n", - "\n", - "thing2 = ['apple',8,9]\n", - "thing.append(thing2)\n", - "print(thing)\n", - "\n", - "#3 \n", - "thing = [1, 3, 8, 'elephant', 'banana', 2]\n", - "thing2 = ['apple',8,9]\n", - "thing.extend(thing2)\n", - "print(thing)\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Challenge 3: Creating a Dictionary\n", - "\n", - "Create a dictionary `fruits` with the following lists. Print the list of keys in the dictionary." - ] - }, - { - "cell_type": "code", - "execution_count": 23, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{'fruit': ['apple', 'orange', 'mango'],\n", - " 'length': [3.2, 2.1, 3.1],\n", - " 'color': ['red', 'orange', 'yellow']}" - ] - }, - "execution_count": 23, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "fruit = ['apple', 'orange', 'mango']\n", - "length = [3.2, 2.1, 3.1]\n", - "color = ['red', 'orange', 'yellow']\n", - "\n", - "\n", - "fruits_dict= {'fruit':fruit,\n", - " 'length':length,\n", - " 'color':color}\n", - "fruits_dict" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Challenge 4: Initializing a DataFrame\n", - "\n", - "The following code gives a couple of errors. What are the errors? What are some ways we could fix this?" 
- ] - }, - { - "cell_type": "code", - "execution_count": 31, - "metadata": {}, - "outputs": [ - { - "ename": "TypeError", - "evalue": "unhashable type: 'list'", - "output_type": "error", - "traceback": [ - "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", - "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 11\u001b[0m \u001b[0mfruit\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mfruit\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 12\u001b[0m \u001b[0mlength\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mlength\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 13\u001b[0;31m color: color}\n\u001b[0m\u001b[1;32m 14\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 15\u001b[0m \u001b[0mdf\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mpd\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mDataFrame\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfruits\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;31mTypeError\u001b[0m: unhashable type: 'list'" - ] - } - ], - "source": [ - "import pandas as pd\n", - "\n", - "#Error 1: the keys are lists, not strings\n", - "#Error 2: The fruit list is shorter than the other two, so the data are not rectangular. 
\n", - "# Add a filler to the fruit list to make sure it is the same length\n", - "fruit = ['apple', 'orange']\n", - "length = [3.2, 2.1, 3.1]\n", - "color = ['red', 'orange', 'yellow']\n", - "\n", - "fruits = {\n", - " fruit: fruit,\n", - " length: length,\n", - " color: color}\n", - "\n", - "df = pd.DataFrame(fruits)" - ] - }, - { - "cell_type": "code", - "execution_count": 32, - "metadata": {}, - "outputs": [], - "source": [ - "fruit = ['apple', 'orange','unknown']\n", - "length = [3.2, 2.1, 3.1]\n", - "color = ['red', 'orange', 'yellow']\n", - "\n", - "fruits = {\n", - " 'fruit': fruit,\n", - " 'length': length,\n", - " 'color': color}\n", - "\n", - "df = pd.DataFrame(fruits)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.6" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/solutions/06_lists.ipynb b/solutions/06_lists.ipynb new file mode 100644 index 0000000..f1e619b --- /dev/null +++ b/solutions/06_lists.ipynb @@ -0,0 +1,113 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Challenge 1: Slicing Lists\n", + "Using the lists in the next cell:\n", + "\n", + "1. What does `thing[start:stop]` do? What is the output?\n", + "2. Write three different ways to slice the string from 'elephant' to the end." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "thing = [1, 3, 8, 'elephant', 'banana', 2]\n", + "start = 2\n", + "stop = 5" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "#1. 
\n", + "thing[2:5] " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "#2. \n", + "print(thing[-3:])\n", + "print(thing[3:])\n", + "print(thing[3:6])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Challenge 2: Appending to Lists\n", + "\n", + "We've created a list called `thing` in the cell below.\n", + "\n", + "1. Append the following values to the list, individually: `'apple'`, `8`, and `9`. Print the ensuing list out.\n", + "2. Make a new list called `thing2` consisting of the values `'apple'`, `8`, and `9`. Append `thing2` to `thing`. How does the output differ from the output from the previous part?\n", + "3. Look at the [documentation](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists) for the list method `.extend()`. Is there a way to rewrite your answer to (2) to use extend? How does that compare to the outputs of the previous two parts?\n", + "4. What is one situation where you would use `append` and one where you would use `extend`?\n", + "\n", + "**Hint**: *Iterable* in Python means an object with multiple values that can be iterated through (including lists, tuples, and even strings)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "thing = [1, 3, 8, 'elephant', 'banana', 2]\n", + "\n", + "# 1\n", + "thing.append('apple')\n", + "thing.append(8)\n", + "thing.append(9)\n", + "print(thing)\n", + "\n", + "# 2\n", + "thing = [1, 3, 8, 'elephant', 'banana', 2]\n", + "\n", + "thing2 = ['apple',8,9]\n", + "thing.append(thing2)\n", + "print(thing)\n", + "\n", + "#3 \n", + "thing = [1, 3, 8, 'elephant', 'banana', 2]\n", + "thing2 = ['apple',8,9]\n", + "thing.extend(thing2)\n", + "print(thing)\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.12" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/solutions/07_dictionaries_and_dataframes.ipynb b/solutions/07_dictionaries_and_dataframes.ipynb new file mode 100644 index 0000000..816f95c --- /dev/null +++ b/solutions/07_dictionaries_and_dataframes.ipynb @@ -0,0 +1,152 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Challenge 1: Creating a Dictionary\n", + "\n", + "Create a dictionary `fruits` with the following lists. Print the list of keys in the dictionary." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "fruit = ['apple', 'orange', 'mango']\n", + "length = [3.2, 2.1, 3.1]\n", + "color = ['red', 'orange', 'yellow']\n", + "\n", + "\n", + "fruits_dict= {'fruit':fruit,\n", + " 'length':length,\n", + " 'color':color}\n", + "fruits_dict" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Challenge 2: Initializing a DataFrame\n", + "\n", + "The following code gives a couple of errors. What are the errors? What are some ways we could fix this?" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import pandas as pd\n", + "\n", + "#Error 1: the keys are lists, not strings\n", + "#Error 2: The fruit list is shorter than the other two, so the data are not rectangular. \n", + "# Add a filler to the fruit list to make sure it is the same length\n", + "fruit = ['apple', 'orange']\n", + "length = [3.2, 2.1, 3.1]\n", + "color = ['red', 'orange', 'yellow']\n", + "\n", + "fruits = {\n", + " fruit: fruit,\n", + " length: length,\n", + " color: color}\n", + "\n", + "df = pd.DataFrame(fruits)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Solution:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "fruit = ['apple', 'orange','unknown']\n", + "length = [3.2, 2.1, 3.1]\n", + "color = ['red', 'orange', 'yellow']\n", + "\n", + "fruits = {\n", + " 'fruit': fruit,\n", + " 'length': length,\n", + " 'color': color}\n", + "\n", + "df = pd.DataFrame(fruits)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Challenge 3: `value_counts()`\n", + "\n", + "There is another pandas function `.value_counts()` which can be used to help organize the information provided by both `unique()` and `nunique()`. 
Read the [documentation](https://pandas.pydata.org/docs/reference/api/pandas.Series.value_counts.html) and apply `value_counts()` to the `df` variable. How many 'red' and 'orange' fruits are in the DataFrame?" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "fruit = ['apple', 'orange', 'mango', 'strawberry', 'salmonberry', 'thimbleberry']\n", + "size = [3, 2, 3, 1, 1, 1]\n", + "color = ['red', 'orange', 'orange', 'red', 'orange', 'red']\n", + "\n", + "fruits = {\n", + " 'fruit': fruit,\n", + " 'size': size,\n", + " 'color': color}\n", + "\n", + "df = pd.DataFrame(fruits)\n", + "df" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "df['color'].value_counts()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.12" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/solutions/07_loops.ipynb b/solutions/08_loops.ipynb similarity index 65% rename from solutions/07_loops.ipynb rename to solutions/08_loops.ipynb index 18fc4f5..27c291a 100644 --- a/solutions/07_loops.ipynb +++ b/solutions/08_loops.ipynb @@ -11,18 +11,9 @@ }, { "cell_type": "code", - "execution_count": 1, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "ename": "SyntaxError", - "evalue": "invalid syntax (, line 1)", - "output_type": "error", - "traceback": [ - "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m for kitten in [2, 3, 5] #missing colon\u001b[0m\n\u001b[0m 
^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid syntax\n" - ] - } - ], + "outputs": [], "source": [ "for kitten in [2, 3, 5] #missing colon\n", "print(k) #not indented\n", @@ -32,19 +23,9 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "2\n", - "3\n", - "5\n" - ] - } - ], + "outputs": [], "source": [ "#solution\n", "for kitten in [2, 3, 5]:\n", @@ -73,21 +54,9 @@ }, { "cell_type": "code", - "execution_count": 4, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "ename": "NameError", - "evalue": "name 'mountain_df' is not defined", - "output_type": "error", - "traceback": [ - "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", - "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 17\u001b[0m )\n\u001b[1;32m 18\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 19\u001b[0;31m \u001b[0mmountain_df\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", - "\u001b[0;31mNameError\u001b[0m: name 'mountain_df' is not defined" - ] - } - ], + "outputs": [], "source": [ "import pandas as pd\n", "\n", @@ -112,22 +81,9 @@ }, { "cell_type": "code", - "execution_count": 6, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "4409.5199999999995\n", - "4371.215999999999\n", - "4332.608\n", - "4331.392\n", - "4310.416\n", - "4253.568\n" - ] - } - ], + "outputs": [], "source": [ "elevation = mountains_df['elevation']\n", "\n", @@ -148,17 +104,9 @@ }, { "cell_type": "code", - "execution_count": 8, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "12\n" - ] - } - ], + "outputs": [], "source": [ "total = 0\n", "words 
= [\"red\", \"green\", \"blue\"]\n", @@ -178,17 +126,9 @@ }, { "cell_type": "code", - "execution_count": 10, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[3, 5, 4]\n" - ] - } - ], + "outputs": [], "source": [ "lengths = []\n", "words = [\"red\", \"green\", \"blue\"]\n", @@ -208,17 +148,9 @@ }, { "cell_type": "code", - "execution_count": 11, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "redgreenblue\n" - ] - } - ], + "outputs": [], "source": [ "words = [\"red\", \"green\", \"blue\"]\n", "result = ''\n", @@ -238,17 +170,9 @@ }, { "cell_type": "code", - "execution_count": 14, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "RGB\n" - ] - } - ], + "outputs": [], "source": [ "words = [\"red\", \"green\", \"blue\"]\n", "result = ''\n", @@ -276,7 +200,7 @@ }, { "cell_type": "code", - "execution_count": 15, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -301,31 +225,9 @@ }, { "cell_type": "code", - "execution_count": 16, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Difference between Mt. Whitney and Mt. Williamson is 126 feet\n", - "Difference between Mt. Whitney and White Mountain Peak is 253 feet\n", - "Difference between Mt. Whitney and North Palisade is 257 feet\n", - "Difference between Mt. Whitney and Mt. Shasta is 326 feet\n", - "Difference between Mt. Whitney and Mt. Humphreys is 513 feet\n", - "Difference between Mt. Williamson and White Mountain Peak is 127 feet\n", - "Difference between Mt. Williamson and North Palisade is 131 feet\n", - "Difference between Mt. Williamson and Mt. Shasta is 200 feet\n", - "Difference between Mt. Williamson and Mt. 
Humphreys is 387 feet\n", - "Difference between White Mountain Peak and North Palisade is 4 feet\n", - "Difference between White Mountain Peak and Mt. Shasta is 73 feet\n", - "Difference between White Mountain Peak and Mt. Humphreys is 260 feet\n", - "Difference between North Palisade and Mt. Shasta is 69 feet\n", - "Difference between North Palisade and Mt. Humphreys is 256 feet\n", - "Difference between Mt. Shasta and Mt. Humphreys is 187 feet\n" - ] - } - ], + "outputs": [], "source": [ "elevations = mountains_df['elevation']\n", "mountains = mountains_df['mountain']\n", @@ -349,7 +251,7 @@ ], "metadata": { "kernelspec": { - "display_name": "Python 3", + "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, @@ -363,7 +265,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.7.6" + "version": "3.9.12" } }, "nbformat": 4, diff --git a/solutions/08_conditionals.ipynb b/solutions/09_conditionals.ipynb similarity index 65% rename from solutions/08_conditionals.ipynb rename to solutions/09_conditionals.ipynb index 74ff0f3..41d4df9 100644 --- a/solutions/08_conditionals.ipynb +++ b/solutions/09_conditionals.ipynb @@ -13,25 +13,13 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "ename": "NameError", - "evalue": "name 'TRUE' is not defined", - "output_type": "error", - "traceback": [ - "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", - "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0mnumber_of_trees\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;36m14\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0mnumber_of_shrubs\u001b[0m \u001b[0;34m=\u001b[0m 
\u001b[0;36m8\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0mhas_flowers\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mTRUE\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 4\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0;31m# needs double ==\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;31mNameError\u001b[0m: name 'TRUE' is not defined" - ] - } - ], + "outputs": [], "source": [ "number_of_trees = 14\n", "number_of_shrubs = 8\n", - "has_flowers = TRUE #Needs to be Trrue\n", + "has_flowers = TRUE #Needs to be True\n", " \n", "# needs double == ; \n", "print((number_of_trees > 14) and (number_of_trees = number_of_shrubs) or not (has_flowers))" @@ -39,17 +27,9 @@ }, { "cell_type": "code", - "execution_count": 3, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "False\n" - ] - } - ], + "outputs": [], "source": [ "number_of_trees = 14\n", "number_of_shrubs = 8\n", @@ -77,25 +57,9 @@ }, { "cell_type": "code", - "execution_count": 6, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "80 is a B.\n", - "80 is a C.\n", - "85 is a B.\n", - "85 is a C.\n", - "99 is a B.\n", - "99 is a C.\n", - "75 is a C.\n", - "70 is a C.\n", - "68 is a D.\n" - ] - } - ], + "outputs": [], "source": [ "scores = [80, 85, 99, 75, 70, 68]\n", "\n", @@ -117,22 +81,9 @@ }, { "cell_type": "code", - "execution_count": 8, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "80 is a B.\n", - "85 is a B.\n", - "99 is an A.\n", - "75 is a C.\n", - "70 is a C.\n", - "68 is a D.\n" - ] - } - ], + "outputs": [], "source": [ "## solution\n", "scores = [80, 85, 99, 75, 70, 68]\n", @@ -159,17 +110,9 @@ }, { "cell_type": "code", - 
"execution_count": 24, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[0, 0.2, 0.4, 0.0, 0, 0.4]\n" - ] - } - ], + "outputs": [], "source": [ "original = [-1.5, 0.2, 0.4, 0.0, -1.3, 0.4]\n", "result = []\n", @@ -184,7 +127,7 @@ }, { "cell_type": "code", - "execution_count": 19, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -239,48 +182,25 @@ }, { "cell_type": "code", - "execution_count": 20, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "['Martin Van Buren', 'James Buchanan', 'George H. W. Bush', 'George W. Bush', 'Joe Biden']\n" - ] - } - ], + "outputs": [], "source": [ "last_name_b = []\n", "for p in presidents:\n", - " if p.split(' ')[-1][0] == 'B':\n", + " if p.split(' ')[0][0] == 'J':\n", " last_name_b.append(p)\n", "print(last_name_b)" ] }, { "cell_type": "code", - "execution_count": 23, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "['Martin Van Buren',\n", - " 'James Buchanan',\n", - " 'George H. W. Bush',\n", - " 'George W. 
Bush',\n", - " 'Joe Biden']" - ] - }, - "execution_count": 23, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "# BONUS: Another method is list comprehension\n", - "[p for p in presidents if p.split(' ')[-1][0] == 'B']" + "[p for p in presidents if p.split(' ')[0][0] == 'J']" ] }, { @@ -293,7 +213,7 @@ ], "metadata": { "kernelspec": { - "display_name": "Python 3", + "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, @@ -307,7 +227,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.7.6" + "version": "3.9.12" } }, "nbformat": 4, diff --git a/solutions/09_custom_functions.ipynb b/solutions/09_custom_functions.ipynb deleted file mode 100644 index 3ce7cb3..0000000 --- a/solutions/09_custom_functions.ipynb +++ /dev/null @@ -1,266 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Challenge 1: My First Function\n", - "\n", - "Write a function that converts Celsius temperatures to Fahrenheit. The formula for this conversion is:\n", - "\n", - "$$F = 1.8 * C + 32$$" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [], - "source": [ - "def c_to_f(temp):\n", - " return(temp * 1.8 +32)" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "32.0" - ] - }, - "execution_count": 7, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "c_to_f(0)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Challenge 2: Calling Keyword Arguments\n", - "\n", - "Call the following function using the keyword arguments. Specifically, print the date corresponding to January 1, 2003." 
- ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [], - "source": [ - "def print_date(year, month, day):\n", - " joined = str(year) + '/' + str(month) + '/' + str(day)\n", - " print(joined)" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Jan/1/2003\n" - ] - } - ], - "source": [ - "print_date('Jan',1,2003)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Challenge 3: More Errors!\n", - "\n", - "Why do the following lines return errors?\n", - "\n", - "**Hint**: Think about what happens inside the function, and how the arguments fit into the function." - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "metadata": {}, - "outputs": [], - "source": [ - "def divide(x, y=10):\n", - " return(x / y)" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": { - "scrolled": true - }, - "outputs": [ - { - "ename": "SyntaxError", - "evalue": "positional argument follows keyword argument (, line 2)", - "output_type": "error", - "traceback": [ - "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m2\u001b[0m\n\u001b[0;31m divide(y=10, 4)\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m positional argument follows keyword argument\n" - ] - } - ], - "source": [ - "#Keyword arguments must follow positional (non-keyword) arguments\n", - "divide(y=10, 4) " - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "0.4" - ] - }, - "execution_count": 14, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# solution\n", - "divide(4,y=10)" - ] - }, - { - "cell_type": "code", - "execution_count": 16, - "metadata": { - "scrolled": true - }, - "outputs": [ - { - "ename": "TypeError", - "evalue": "unsupported operand type(s) for /: 
'int' and 'str'", - "output_type": "error", - "traceback": [ - "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", - "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;31m#Wrong type for the argument\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mdivide\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m4\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m'10'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", - "\u001b[0;32m\u001b[0m in \u001b[0;36mdivide\u001b[0;34m(x, y)\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mdivide\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m10\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0;32mreturn\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m \u001b[0;34m/\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", - "\u001b[0;31mTypeError\u001b[0m: unsupported operand type(s) for /: 'int' and 'str'" - ] - } - ], - "source": [ - "#Wrong type for the argument. Convert the string to a number\n", - "divide(4, y='10')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "#solution\n", - "divide(4, y=int('10'))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Challenge 4: Advanced Conversion Function\n", - "\n", - "Let's take our conversion function from before make it more flexible.\n", - "\n", - "Let's say we want to convert from feet to other units as well. 
Change the original conversion function to a more generalized version `convert_from_feet(x,unit='meters')` by adding a keyword argument `unit` that defaults to meters. Use if statements within the body of the function to do the appropriate conversion based on this keyword argument and return the value. Choose two additional units to meters (such as inches, miles, or centimeters), and add them to the conversion function.\n", - "\n", - "\n", - "\n", - "Follow the steps:\n", - "\n", - "1. Plan your function. \n", - "2. Write your function. \n", - "3. Test the function.\n", - "\n", - "**Bonus**: What if you wanted to convert several values at once? What if you want to convert to several other units at once?" - ] - }, - { - "cell_type": "code", - "execution_count": 22, - "metadata": {}, - "outputs": [], - "source": [ - "def convert_from_feet(x,unit='meters'):\n", - " if unit=='meters':\n", - " return(x*.308)\n", - " elif unit == 'inches':\n", - " return(x*12)\n", - " elif unit == 'yards':\n", - " return(x*3)\n", - " else:\n", - " return('unit not found')" - ] - }, - { - "cell_type": "code", - "execution_count": 23, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "72" - ] - }, - "execution_count": 23, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "convert_from_feet(6,unit='inches')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.6" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/solutions/10_custom_functions.ipynb b/solutions/10_custom_functions.ipynb new file mode 100644 index 
0000000..9ef2077 --- /dev/null +++ b/solutions/10_custom_functions.ipynb @@ -0,0 +1,193 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Challenge 1: My First Function\n", + "\n", + "Write a function that converts Celsius temperatures to Fahrenheit. The formula for this conversion is:\n", + "\n", + "$$F = 1.8 * C + 32$$" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def c_to_f(temp):\n", + " return(temp * 1.8 +32)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "c_to_f(0)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Challenge 2: More Errors!\n", + "\n", + "Why do the following lines return errors?\n", + "\n", + "**Hint**: Think about what happens inside the function, and how the arguments fit into the function." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def divide(x, y=10):\n", + " return(x / y)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "#Keyword arguments must follow positional (non-keyword) arguments\n", + "divide(z=False,10, 4)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# solution\n", + "divide(10, 4,False)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "#Wrong type for the argument. 
Convert the string to a number\n", + "divide(4, y='10')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "#solution\n", + "divide(4, y=int('10'))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "divide(4) #needs another argument" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "#solution\n", + "divide(4,10)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Challenge 3: Advanced Conversion Function\n", + "\n", + "Let's take our conversion function from before and make it more flexible.\n", + "\n", + "Let's say we want to convert from feet to other units as well. Change the original conversion function to a more generalized version `convert_from_feet(x,unit='meters')` by adding a keyword argument `unit` that defaults to meters. Use if statements within the body of the function to do the appropriate conversion based on this keyword argument and return the value. Choose two units in addition to meters (such as inches, miles, or centimeters), and add them to the conversion function.\n", + "\n", + "\n", + "\n", + "Follow the steps:\n", + "\n", + "1. Plan your function. \n", + "2. Write your function. \n", + "3. Test the function.\n", + "\n", + "**Bonus**: What if you wanted to convert several values at once? What if you want to convert to several other units at once?" 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def convert_from_feet(x,unit='meters'):\n", + " if unit=='meters':\n", + " return(x*.3048)\n", + " elif unit == 'inches':\n", + " return(x*12)\n", + " elif unit == 'yards':\n", + " return(x/3)\n", + " else:\n", + " return('unit not found')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "convert_from_feet(6,unit='inches')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.12" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +}