<a href="https://colab.research.google.com/github/mggg/Training_Materials_25/blob/main/notebooks/practitioners/Wednesday/Tutorial_2_hello_world.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# A Very Brief Introduction to Python

This notebook introduces the very basics of Python. It assumes that you know nothing about Python or programming, so we will start with the most basic concepts. Note that this tutorial is NOT meant to teach you how to code well, but rather to provide
you with the basic knowledge needed to understand the code written for the upcoming training sessions. This notebook will also be a little verbose so that it may serve as a reference as the training progresses.


### Objectives

By the end of this notebook, you should be able to:

* Manipulate basic Python data types and their arrangement into data structures
  - data types include: integers, booleans, floats, and strings
  - data structures made out of those include: lists (including arrays) and dictionaries (including dataframes)
* Understand the basic syntax of a function
* Be able to use a `for` loop
* Be able to open a file in Google Colab

## Python Types and Variables

There are 4 basic data types in Python:

* Integers
* Floats (real numbers)
* Booleans
* Strings


All of these basic types can be stored in a structure called a variable using the syntax:

```python
variable_name = value
```

For example, we can create a variable called `x` and assign it the value of `9` by using the following syntax:

In [1]:
x = 9

We can now confirm that the value `9` has been stored in the variable `x` by printing it out:

In [2]:
print(x)

9


If we are in a Jupyter notebook (like this one), we can also just put the variable at the end of the
cell and its value will be displayed for us.

In [3]:
x

9

In the above example, we have assigned the variable `x` to the __*integer*__ value `9`. We can assign
more variables in the same way and then do basic arithmetic operations with them.

In [4]:
y = 2

x + y

11

In [5]:
y - x

-7

In [6]:
# Any text after the '#' character is called a comment. This is not a part of your code, but
# it can be useful to remind yourself what you were doing previously. When in doubt, add a comment!


# In Python, the '/' operator always gives back a number with a decimal point, even when the
# division comes out evenly (for example, 4/2 gives 2.0).
# In Python, a number with a decimal point is called a FLOAT
x/y

4.5

In [8]:
# The type() function reports the data type of a value

print('the type of x is ',type(x),' and the type of x/y is ',type(x/y))

the type of x is  <class 'int'>  and the type of x/y is  <class 'float'>


Python also has a few other arithmetic operations that are worth knowing, such as modulus and floor division.

In [9]:
# This operation is called "modulus" and it just gives you the remainder of the division
# 9/2 has a remainder of 1, so this will return 1
x % y

1

In [10]:
# This operation is called "floor division". It divides and then rounds the result
# down to the nearest whole number

# 9/2 = 4.5, so this will return 4
x // y

4
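
The "rounds down" part of floor division is easiest to see with a negative number. The short sketch below reuses `x` and `y` from above. The other variable names are made up for illustration, and `divmod()` is a built-in Python function that we have not used so far.

```python
x = 9
y = 2

# Floor division rounds DOWN (toward negative infinity), which matters for negatives
positive_quotient = x // y     # 4.5 rounds down to 4
negative_quotient = -x // y    # -4.5 rounds down to -5, not -4

# divmod() gives the floor-division result and the remainder at the same time
quotient, remainder = divmod(x, y)

print(positive_quotient, negative_quotient)
print(quotient, remainder)

# The quotient and remainder always fit back together like this
print(quotient * y + remainder == x)
```

The last line prints `True`, which is a handy way to remember how `//` and `%` relate to each other.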

Lastly, it is sometimes useful to save the results of a computation so that we can use it later.
This can be done quite easily by making another variable

In [11]:
# Notice: variable names can be almost anything you want. The main restrictions are that they cannot
# start with a number, cannot contain special characters like ' ' (space), '@', '#', etc., and cannot
# be one of Python's reserved words (such as 'if' or 'for').

# Generally, it is advised that your variable names are descriptive and lowercase, with words
# separated by underscores.
my_result = x + y

Note: You cannot work with variables that have not been defined earlier in your code.

In [12]:
a + b  # Python does not know what a and b are yet, so this will give an error
a = 1
b = 2

NameError: name 'a' is not defined

### Strings

Strings are generally just little bits of text. We generally like to use them so that we can
make more descriptive print statements. They are also needed to make things like file names
or dictionaries (we'll talk about dictionaries later).

In Python, you can actually put a `+` sign between two strings to concatenate them.

In [13]:
# Strings can be surrounded by either single or double quotes.
string_1 = "Goodbye, Mars!"
string_2 = ' Hello, World!'
string_1 + string_2

'Goodbye, Mars! Hello, World!'

In [14]:
# Numbers and strings cannot be added together, so this will give an error

x + string_1

TypeError: unsupported operand type(s) for +: 'int' and 'str'
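
If you do want to glue a number onto a string, one option is to convert the number to a string first. The sketch below uses the built-in `str()` and `int()` functions. The names `combined` and `number_from_text` are just for illustration.

```python
x = 9
string_1 = "Goodbye, Mars!"

# str() turns the integer 9 into the string "9", so "+" can concatenate
combined = str(x) + " " + string_1
print(combined)

# int() goes the other way: it turns the text "9" into the number 9
number_from_text = int("9") + x
print(number_from_text)
```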

In [16]:
# Formatted strings are preceded by the letter 'f' and allow you to insert variables into the string
# We really like this feature for debugging.

my_formatted_string = f"The value of x is '{x}' and the value of y is '{y}'."
print(my_formatted_string) # "print" is a printing function that lets us see strings in the console

The value of x is '9' and the value of y is '2'.


You may also see strings preceded by a `b` or an `r`, or strings in triple quotes. We will not
worry about these for now.

### Functions

For now, it is useful to think about functions just as a nice way of putting together a bunch
of code so that you don't have to do the "copy and paste" thing all the time.

Terminology:

* The **parameters**/**arguments** of a function are the things that you put in the parentheses when
  you define or call the function.
* The **body** of a function is the code that is executed when you call the function.
* The **return value** of a function is what the function hands back to you when you call it. Return
  values can then be stored in variables and used later.

In [29]:
def make_greeting(name):
    return f"Hello, {name}! Is there anything I can help you with?"


print(make_greeting("Alice"))
print(make_greeting("Bob"))

Hello, Alice! Is there anything I can help you with?
Hello, Bob! Is there anything I can help you with?


In [30]:
def is_pythagorean_triplet(a, b, c):
    # Adding some print statements can help us tell what is going on inside the function
    # In case we get unexpected results
    print(f"Checking if {a}, {b}, and {c} form a Pythagorean triplet.")
    print(f"\tComputed a^2 + b^2  = {a**2 + b**2}, and c^2 = {c**2}")
    return a**2 + b**2 == c**2  # Note that a**2 means "a to the power of 2" in Python


# Here we use the print function to see the results of our function calls
print(is_pythagorean_triplet(3, 4, 5))  # This should return True
print(is_pythagorean_triplet(1, 2, 3))   # This should return False

Checking if 3, 4, and 5 form a Pythagorean triplet.
	Computed a^2 + b^2  = 25, and c^2 = 25
True
Checking if 1, 2, and 3 form a Pythagorean triplet.
	Computed a^2 + b^2  = 5, and c^2 = 9
False
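
The examples above print the return value right away, but we can also store it in a variable and use it later. This is a minimal sketch with a trimmed-down copy of the function (no print statements inside). The names `result` and `message` are made up for illustration.

```python
def is_pythagorean_triplet(a, b, c):
    return a**2 + b**2 == c**2


# Store the return value (a boolean) in a variable...
result = is_pythagorean_triplet(5, 12, 13)

# ...and use it later, for example inside a formatted string
message = f"Is (5, 12, 13) a Pythagorean triplet? {result}"
print(message)
```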


### Equality and Booleans

We saw that the `=` operation is used for assignment, so how do we check if two things are equal?
In Python, we represent "equality" with the `==` operator.

In [19]:
# This will print out "False" since x has been set to 9 and y to 2
x == y

False

In [20]:
# This will print out "True"
x == 9

True

The values `False` and `True` are reserved words in Python meant to represent the __*boolean*__
data type. The boolean data type can only take on two values: `True` or `False`
and is generally used to check whether a condition is `True` or `False`.

In [21]:
# Below is an example of an "if" statement, which is used to make decisions in your code.
# If the condition is true, the code inside the nested block will be executed.

if x == 8:
    print("x is equal to 8")        # This code will be executed if the condition evaluates to "True"
else:
    print("x is not equal to 8")    # This code will be executed if the condition evaluates to "False"

x is not equal to 8


In [22]:
if x == 8:
    print("x is 8")     # This print statement will only execute if x is equal to 8
elif x == 9:
    print("x is 9")    # This print statement will only execute if x is both not equal to 8 and equal to 9
else:
    print("x is neither 8 nor 9")  # This will execute if neither of the previous conditions were true

x is 9


In [23]:
if x == 8:
    print("x is 8")     # This print statement will only execute if x is equal to 8
elif x == 10:
    print("x is 10")    # This print statement will only execute if x is both not equal to 8 and equal to 10
else:
    print("x is neither 8 nor 10")  # This will execute if neither of the previous conditions were true

x is neither 8 nor 10


⚠️ AI WARNING ⚠️

(some more text about AI and coding)

You might see some code that looks like this:

```python
if value is None:
    # do something
```

The word `is` here is another keyword in Python that can, in VERY SPECIFIC SITUATIONS, be used to
check if two things are equal. We recommend that you never use this keyword unless you gain
experience in a lower-level language like C or C++, where you would have to learn what pointers are.

In every place that you are likely to be able to use the word `is` in Python, you can just as well
use the `==` operator.
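
To give a rough sense of why `is` can be surprising, here is a small sketch. The `==` operator compares values, while `is` asks whether two names point at the very same object in memory. The variable names below are made up for illustration.

```python
a = 1      # an integer
b = 1.0    # a float with the same value

same_value = a == b     # compares the values
same_object = a is b    # asks whether a and b are the very same object

print(same_value)
print(same_object)

# For the None check shown above, both spellings give the same answer
value = None
print(value == None)
print(value is None)
```

Here `same_value` is `True` but `same_object` is `False`, because an integer and a float are stored as two separate objects even when they represent the same number.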

### Loops and Lists

Generally, there are two different types of loops in Python: "for" loops and "while" loops. We will 
only cover "for" loops here since they are all you will likely need for most purposes.

The main idea behind a "for" loop is that you specify a range of numbers, and then for each number in
that range, you perform some operation. For example, if you wanted to print the numbers from 1 to 10,
you could do the following:

In [24]:
for i in range(10):
    print(f"Current value of i is {i}")

Current value of i is 0
Current value of i is 1
Current value of i is 2
Current value of i is 3
Current value of i is 4
Current value of i is 5
Current value of i is 6
Current value of i is 7
Current value of i is 8
Current value of i is 9


Wait a second... why did this start at 0 and go to 9 rather than running from 1 to 10? Well, this is
because Python, like most programming languages, is __zero-indexed__. This makes more sense when we
look at it in the context of __lists__.
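
If you really do want the numbers 1 through 10, `range` also accepts a start value and a stop value. Here is a minimal sketch. The `total` variable is only there to show that each value of `i` is an ordinary integer we can compute with.

```python
total = 0

# range(1, 11) starts at 1 and stops BEFORE 11, so i takes the values 1, 2, ..., 10
for i in range(1, 11):
    print(f"Current value of i is {i}")
    total = total + i

print(f"The numbers from 1 to 10 add up to {total}")
```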

Lists are exactly what they sound like: they are just a collection of a bunch of things in some order.
For example, we can make a list of the numbers from 0 to 9:

In [25]:
my_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(my_list)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


Note:

Lists in Python *can* contain items of multiple types, but it's generally not a good idea to do this.

In [26]:
# Don't do this if you can avoid it. It's much better to know the type of every item in your list.
my_list_with_multiple_types = [0, 1, "two", 3.0, True, 100]
print(my_list_with_multiple_types)

[0, 1, 'two', 3.0, True, 100]


And one of the nicest things about lists is that we can iterate over them

In [41]:
# Here we iterate over a list of strings
name_list = ["Alice", "Bob", "Charlie"]
for name in name_list:
    print(f"Hello, {name}!")

Hello, Alice!
Hello, Bob!
Hello, Charlie!


We can also access data that is stored in a list using square brackets:

In [28]:
print(f"The first name in the list is: {name_list[0]}")

The first name in the list is: Alice


So, an equivalent expression to

```python
for name in name_list:
    print(f"Hello, {name}!")
```

in this case would be

In [42]:
for i in range(3):
    print("Hello,", name_list[i]+"!")  # This will print the first three names in the list

Hello, Alice!
Hello, Bob!
Hello, Charlie!
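
Hard-coding the `3` in `range(3)` only works while the list has exactly three names. Two common alternatives are sketched below: asking the list for its length with the built-in `len()` function, or using the built-in `enumerate()` function, which hands back each position together with its item.

```python
name_list = ["Alice", "Bob", "Charlie"]

# len(name_list) is 3 here, so this loop still works if the list grows or shrinks
for i in range(len(name_list)):
    print(f"Position {i}: {name_list[i]}")

# enumerate() gives us the position and the item together
for i, name in enumerate(name_list):
    print(f"Position {i}: {name}")
```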


In [43]:
# You can also edit (i.e., overwrite) items in a list by using their index

name_list[1] = "Maxine"
print(name_list)

['Alice', 'Maxine', 'Charlie']


#### Other useful list functions

In Python, data structures like lists are pretty nice because they come with a bunch of built-in
functions that make them very easy to work with. For example, we can use the `append` function
to add an element to the end of a list:

In [44]:
my_list = [1, 2, 3, 4, 5]
my_list.append(-1)
print(my_list)

[1, 2, 3, 4, 5, -1]


We can determine the number of items in the list using the `len()` function.

In [46]:
print(f"The list stored in the variable `my_list` has {len(my_list)} elements.")

The list stored in the variable `my_list` has 6 elements.


We can make a new list of sorted items by using the `sorted()` function.

In [47]:
print(sorted(my_list))

[-1, 1, 2, 3, 4, 5]
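
Note that `sorted()` leaves the original list alone and gives back a new one. The sketch below also shows the `reverse=True` option and the list's own `.sort()` function, which reorders the list in place. The names `sorted_copy` and `descending` are made up for illustration.

```python
my_list = [1, 2, 3, 4, 5, -1]

# sorted() returns a NEW list; my_list itself is not changed
sorted_copy = sorted(my_list)
print(my_list)
print(sorted_copy)

# reverse=True sorts from largest to smallest
descending = sorted(my_list, reverse=True)
print(descending)

# .sort() changes the list itself and does not return a new list
my_list.sort()
print(my_list)
```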


⚠️ WARNING ⚠️

Lists that contain items of different types can cause very strange behaviour in Python.


In [48]:
print(my_list_with_multiple_types)
sorted(my_list_with_multiple_types)  # This will give an error because you cannot compare strings and numbers

[0, 1, 'two', 3.0, True, 100]


TypeError: '<' not supported between instances of 'str' and 'int'

In [49]:
# Having different types can work out, but in confusing ways...
sorted([0,2,3,True,False,0.5])

[0, False, 0.5, True, 2, 3]

⚠️ WARNING ⚠️

A common mistake is trying to edit a list as you iterate over it.

The following code is commented out because running it would cause an infinite loop: every pass through the loop appends a new item, so the loop never reaches the end of the list. If you do run it, you will have to interrupt execution of the cell to get out, so try this with caution.

In [50]:
# new_list = [1, 2, 3, 4, 5]
# for i in new_list:
#     new_list.append(i + 5)  # This will cause an infinite loop, so be careful with this!
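
One way to avoid this problem is to collect the new items in a separate list and combine the two lists once the loop has finished. This is only a sketch of one possible approach, and the name `extra_items` is made up for illustration.

```python
new_list = [1, 2, 3, 4, 5]
extra_items = []

# We loop over new_list but only append to extra_items, so new_list
# does not change while we are iterating over it
for i in new_list:
    extra_items.append(i + 5)

# Just like strings, two lists can be joined with "+"
new_list = new_list + extra_items
print(new_list)
```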

### Dictionaries

(a little more warmup)

Python dictionaries (also called hash tables or maps) are a data structure that functions a lot like
lists, but rather than being indexed by integers, you can make unique keys for each item. This can
be really useful for categorizing data into different groups.

In [51]:
my_dictionary = {
    "Trump": "Republican",
    "Biden": "Democrat",
    "Harris": "Democrat",
    "Sinema": "Democrat",
    "Cruz": "Republican"
}

Like lists, dictionaries are mutable, meaning that you can edit them by adding or changing elements:

In [52]:
my_dictionary["Sinema"] = "Independent"
my_dictionary["Sanders"] = "Independent"

my_dictionary # Remember that we can print by putting the variable name at the end of the cell

{'Trump': 'Republican',
 'Biden': 'Democrat',
 'Harris': 'Democrat',
 'Sinema': 'Independent',
 'Cruz': 'Republican',
 'Sanders': 'Independent'}
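
Looking up a single value works with square brackets, just like with lists, except that we use a key instead of a position. The sketch below uses a smaller dictionary so that it can run on its own. The `in` keyword and the `.get()` function are standard Python, while the variable names are made up for illustration.

```python
my_dictionary = {
    "Trump": "Republican",
    "Biden": "Democrat",
    "Sanders": "Independent",
}

# Look up the value stored under the key "Biden"
party = my_dictionary["Biden"]
print(party)

# Asking for a key that does not exist raises a KeyError, so it can help to check first
has_cruz = "Cruz" in my_dictionary
print(has_cruz)

# .get() returns a fallback value instead of raising an error
fallback = my_dictionary.get("Cruz", "Unknown")
print(fallback)
```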

We can also iterate over the items in a dictionary using one of the following methods:

In [53]:
for key in my_dictionary.keys():
    print(key)

print("-----------------------------")
for value in my_dictionary.values():
    print(value)

print("-----------------------------")
for pair in my_dictionary.items():
    print(pair)

Trump
Biden
Harris
Sinema
Cruz
Sanders
-----------------------------
Republican
Democrat
Democrat
Independent
Republican
Independent
-----------------------------
('Trump', 'Republican')
('Biden', 'Democrat')
('Harris', 'Democrat')
('Sinema', 'Independent')
('Cruz', 'Republican')
('Sanders', 'Independent')
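
Each pair that comes out of `.items()` can be split into two variables directly in the `for` line. As a sketch of why this is handy, the code below counts how many people belong to each party. The names `name`, `party`, and `party_counts` are made up for illustration.

```python
my_dictionary = {
    "Trump": "Republican",
    "Biden": "Democrat",
    "Harris": "Democrat",
    "Sinema": "Independent",
    "Cruz": "Republican",
    "Sanders": "Independent",
}

party_counts = {}
for name, party in my_dictionary.items():
    # The first time we see a party, start its count at zero
    if party not in party_counts:
        party_counts[party] = 0
    party_counts[party] = party_counts[party] + 1

print(party_counts)
```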


## Libraries

So, one of the nicest things about Python is that there are a ton of libraries out there that you can use to make your life easier. The main libraries that we will want to be aware of in this training are:

- NumPy (fast library for math operations)
- Pandas (for dataframes, which are like spreadsheets)
- GeoPandas (for dataframes that have geospatial information)
- VoteKit (for elections!)


However, in order to use these libraries, we need to install them. In Colab, you can simply make a new cell and type in the following:

```console
!pip install <library_name>
```

and the Python package manager `pip` will install the library for you. (The '!' at the beginning of the cell is used for certain command-line functions, like `pip` -- used for installing and removing Python packages -- or `curl` -- used for downloading files from the internet)

In [54]:
!pip install numpy



We can now use the library by importing it into our notebook with the syntax:

```python
import numpy
```


#### NumPy

NumPy is a powerful mathematical computing library that is mainly used for manipulating arrays
(think lists) of data. Most importantly, it attaches a bunch of useful functions to arrays,
and allows us to do vector computations easily.

In [55]:
# Adding one to every element in a list using python alone
my_python_list = [1, 2, 3, 4, 5]
for i in range(len(my_python_list)):
    new_value = my_python_list[i] + 1
    my_python_list[i] = new_value


print(my_python_list)

[2, 3, 4, 5, 6]


In [56]:
# The "import a as b" syntax allows you to import a module called "a" and refer to it as "b" in your
# code. This is really useful if you have a long module name or if you want to avoid name conflicts.
import numpy as np


my_numpy_array = np.array([1, 2, 3, 4, 5])
new_numpy_array = my_numpy_array + 1
new_numpy_array

array([2, 3, 4, 5, 6])

In [57]:
# Quick note: if you want to avoid juggling the "new_numpy_array" variable, you can just do this:

my_numpy_array = np.array([1, 2, 3, 4, 5])
my_numpy_array = my_numpy_array + 1
my_numpy_array

array([2, 3, 4, 5, 6])

In [58]:
# You will also see the syntax
# my_numpy_array = my_numpy_array + 1
# shortened to
# my_numpy_array += 1
# which is just a shorthand for the same operation.


my_numpy_array = np.array([1,2,3,4,5])
my_numpy_array += 1
my_numpy_array

array([2, 3, 4, 5, 6])

In [59]:
# Note: you can still append to numpy arrays, but it is a little trickier than with lists.

my_numpy_array = np.append(my_numpy_array, [8,8,8])
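
The reason this is "trickier" is that `np.append` does not change the array you give it. It builds and returns a new array, which is why the result has to be assigned to a variable. A small sketch, with made-up variable names:

```python
import numpy as np

original = np.array([2, 3, 4, 5, 6])

# np.append returns a NEW array; "original" is left untouched
extended = np.append(original, [8, 8, 8])

print(original)
print(extended)
```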

So arithmetic operations with NumPy arrays are much easier to write, and, as a bonus, they are
generally faster than the 'for' loop methods. Here are some other quality-of-life functions:

In [61]:
print(f"The sum of the array {my_numpy_array} is {my_numpy_array.sum()}")
print()
print(f"The mean of the array {my_numpy_array} is {my_numpy_array.mean()}")
print()
print(f"The standard deviation of the array {my_numpy_array} is {my_numpy_array.std()}")
print()
print(f"The maximum value in the array {my_numpy_array} is {my_numpy_array.max()}")
print()
print(f"The minimum value in the array {my_numpy_array} is {my_numpy_array.min()}")
print()
print(f"The median value in the array {my_numpy_array} is {np.median(my_numpy_array)}")
print()
print(f"The unique elements of the array {my_numpy_array} are {np.unique(my_numpy_array)}")
print(f"\t And you can also count how many times each unique element appears: {np.unique(my_numpy_array, return_counts=True)}")

The sum of the array [2 3 4 5 6 8 8 8] is 44

The mean of the array [2 3 4 5 6 8 8 8] is 5.5

The standard deviation of the array [2 3 4 5 6 8 8 8] is 2.23606797749979

The maximum value in the array [2 3 4 5 6 8 8 8] is 8

The minimum value in the array [2 3 4 5 6 8 8 8] is 2

The median value in the array [2 3 4 5 6 8 8 8] is 5.5

The unique elements of the array [2 3 4 5 6 8 8 8] are [2 3 4 5 6 8]
	 And you can also count how many times each unique element appears: (array([2, 3, 4, 5, 6, 8]), array([1, 1, 1, 1, 1, 3]))
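
The last result above is a pair of arrays: the unique values and how often each one appears. If that is hard to read, one option is to unpack the pair and store the counts in a dictionary. This is just a sketch, and the name `count_by_value` is made up for illustration.

```python
import numpy as np

my_numpy_array = np.array([2, 3, 4, 5, 6, 8, 8, 8])

# Unpack the two arrays that np.unique returns into two variables
values, counts = np.unique(my_numpy_array, return_counts=True)

count_by_value = {}
for i in range(len(values)):
    # int() converts NumPy's number type into a plain Python integer
    count_by_value[int(values[i])] = int(counts[i])

print(count_by_value)
```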


#### Pandas and GeoPandas

The Pandas and GeoPandas libraries in Python are the main way that users interact with spreadsheet-like data in Python. Let's take a look at how to work with this data. Please navigate to the directory at

[https://github.com/mggg/Training_Materials_25/tree/main/data](https://github.com/mggg/Training_Materials_25/tree/main/data)

download the files onto your computer, and then move them to your Colab notebook or to your working directory.

In [62]:
!pip install pandas
!pip install geopandas



In [64]:
import pandas as pd

my_dataframe = pd.read_csv("county_population.csv")
my_dataframe

Unnamed: 0,GEOID20,tot_pop_20,bpop_20,hpop_20,asian_nhpi_pop_20,amin_pop_20,other_pop_20,white_pop_20,STATEFP,STATE
0,1001,58805,12266,1960,1298,1162,537,41582,1,AL
1,1003,231767,20913,12219,3476,5970,2694,186495,1,AL
2,1005,25223,12261,1394,134,237,111,11086,1,AL
3,1007,22293,4643,695,50,339,124,16442,1,AL
4,1009,59134,1250,5732,359,1514,515,49764,1,AL
...,...,...,...,...,...,...,...,...,...,...
3215,72145,54414,9043,45016,22,17,37,279,72,PR
3216,72147,8249,2147,5487,3,19,60,533,72,PR
3217,72149,22093,3538,18474,2,5,9,65,72,PR
3218,72151,30426,6124,24147,11,2,27,115,72,PR


In [65]:
my_dataframe = pd.read_excel("county_population.xlsx", sheet_name="US County Population")
my_dataframe

Unnamed: 0,GEOID20,tot_pop_20,bpop_20,hpop_20,asian_nhpi_pop_20,amin_pop_20,other_pop_20,white_pop_20,STATEFP,STATE
0,1001,58805,12266,1960,1298,1162,537,41582,1,AL
1,1003,231767,20913,12219,3476,5970,2694,186495,1,AL
2,1005,25223,12261,1394,134,237,111,11086,1,AL
3,1007,22293,4643,695,50,339,124,16442,1,AL
4,1009,59134,1250,5732,359,1514,515,49764,1,AL
...,...,...,...,...,...,...,...,...,...,...
3215,72145,54414,9043,45016,22,17,37,279,72,PR
3216,72147,8249,2147,5487,3,19,60,533,72,PR
3217,72149,22093,3538,18474,2,5,9,65,72,PR
3218,72151,30426,6124,24147,11,2,27,115,72,PR


In [66]:
my_dataframe = pd.read_parquet("county_population.parquet")
my_dataframe

Unnamed: 0,GEOID20,tot_pop_20,bpop_20,hpop_20,asian_nhpi_pop_20,amin_pop_20,other_pop_20,white_pop_20,STATEFP,STATE
0,01001,58805,12266,1960,1298,1162,537,41582,01,AL
1,01003,231767,20913,12219,3476,5970,2694,186495,01,AL
2,01005,25223,12261,1394,134,237,111,11086,01,AL
3,01007,22293,4643,695,50,339,124,16442,01,AL
4,01009,59134,1250,5732,359,1514,515,49764,01,AL
...,...,...,...,...,...,...,...,...,...,...
3215,72145,54414,9043,45016,22,17,37,279,72,PR
3216,72147,8249,2147,5487,3,19,60,533,72,PR
3217,72149,22093,3538,18474,2,5,9,65,72,PR
3218,72151,30426,6124,24147,11,2,27,115,72,PR
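
Notice that the CSV and Excel versions show `1001` for the first county while the Parquet version shows `01001`. A likely explanation (we are assuming here, since it depends on how the files were saved) is that pandas guessed that the column holds numbers when it read the CSV, and numbers do not keep leading zeros. The sketch below uses a tiny made-up CSV in place of the real file to show how the `dtype` option of `pd.read_csv` can keep such codes as text.

```python
import io

import pandas as pd

# A tiny stand-in for a CSV file on disk (made up for illustration)
csv_text = "GEOID20,STATEFP,STATE\n01001,01,AL\n72151,72,PR\n"

# By default, pandas guesses that these columns are numbers and drops the leading zeros
as_numbers = pd.read_csv(io.StringIO(csv_text))

# Telling pandas to read the columns as strings keeps the codes exactly as written
as_text = pd.read_csv(io.StringIO(csv_text), dtype={"GEOID20": str, "STATEFP": str})

print(as_numbers["GEOID20"].tolist())
print(as_text["GEOID20"].tolist())
```

With a real file, you would pass the file name (for example `"county_population.csv"`) in place of `io.StringIO(csv_text)`.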


In [67]:
my_dataframe["STATE"]

Unnamed: 0,STATE
0,AL
1,AL
2,AL
3,AL
4,AL
...,...
3215,PR
3216,PR
3217,PR
3218,PR
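
Square brackets can also select several columns at once, and a whole column can be used in arithmetic just like a NumPy array. The sketch below builds a small dataframe by hand, with three rows copied from the output above, so that it runs without the data files. The names `small_dataframe`, `subset`, and `bpop_share` are made up for illustration.

```python
import pandas as pd

small_dataframe = pd.DataFrame({
    "STATE": ["AL", "AL", "PR"],
    "tot_pop_20": [58805, 231767, 30426],
    "bpop_20": [12266, 20913, 6124],
})

# A list of column names inside the brackets selects several columns
subset = small_dataframe[["STATE", "tot_pop_20"]]
print(subset)

# Arithmetic on columns works row by row; here we store the result as a new column
small_dataframe["bpop_share"] = small_dataframe["bpop_20"] / small_dataframe["tot_pop_20"]
print(small_dataframe)

# Columns also come with summary functions like .sum()
total_population = small_dataframe["tot_pop_20"].sum()
print(total_population)
```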


In [68]:
# If you have ever worked with a database, this might look familiar.
my_dataframe.query("STATE == 'NY'")

Unnamed: 0,GEOID20,tot_pop_20,bpop_20,hpop_20,asian_nhpi_pop_20,amin_pop_20,other_pop_20,white_pop_20,STATEFP,STATE
1828,36001,314848,51493,17035,27292,3304,4829,210895,36,NY
1829,36003,46456,1147,914,590,775,703,42327,36,NY
1830,36005,1472654,580689,664646,74587,3956,17980,130796,36,NY
1831,36007,198683,17393,8191,10746,2957,3223,156173,36,NY
1832,36009,77042,1955,1465,805,4181,1082,67554,36,NY
...,...,...,...,...,...,...,...,...,...,...
1885,36115,61302,2196,1551,505,1358,1087,54605,36,NY
1886,36117,91283,4194,4100,799,1562,1398,79230,36,NY
1887,36119,1004457,166162,245884,75085,2851,16791,497684,36,NY
1888,36121,40531,2219,1287,337,605,504,35579,36,NY


In [69]:
my_dataframe[my_dataframe["STATE"] == "NY"]

Unnamed: 0,GEOID20,tot_pop_20,bpop_20,hpop_20,asian_nhpi_pop_20,amin_pop_20,other_pop_20,white_pop_20,STATEFP,STATE
1828,36001,314848,51493,17035,27292,3304,4829,210895,36,NY
1829,36003,46456,1147,914,590,775,703,42327,36,NY
1830,36005,1472654,580689,664646,74587,3956,17980,130796,36,NY
1831,36007,198683,17393,8191,10746,2957,3223,156173,36,NY
1832,36009,77042,1955,1465,805,4181,1082,67554,36,NY
...,...,...,...,...,...,...,...,...,...,...
1885,36115,61302,2196,1551,505,1358,1087,54605,36,NY
1886,36117,91283,4194,4100,799,1562,1398,79230,36,NY
1887,36119,1004457,166162,245884,75085,2851,16791,497684,36,NY
1888,36121,40531,2219,1287,337,605,504,35579,36,NY


In [70]:
my_new_dataframe = my_dataframe.query("STATE == 'NY' and bpop_20 > 100000")
my_new_dataframe

Unnamed: 0,GEOID20,tot_pop_20,bpop_20,hpop_20,asian_nhpi_pop_20,amin_pop_20,other_pop_20,white_pop_20,STATEFP,STATE
1830,36005,1472654,580689,664646,74587,3956,17980,130796,36,NY
1842,36029,954236,151856,50907,52472,11424,9341,678236,36,NY
1851,36047,2736074,874009,432653,408299,7299,45387,968427,36,NY
1855,36055,759443,137530,60936,38570,7040,9214,506153,36,NY
1857,36059,1395774,175649,241413,176009,4102,19147,779454,36,NY
1858,36061,1694251,285271,338271,246939,5049,25427,793294,36,NY
1868,36081,2405464,471487,617149,691071,12803,63596,549358,36,NY
1879,36103,1525920,140589,315750,75414,7968,18869,967330,36,NY
1887,36119,1004457,166162,245884,75085,2851,16791,497684,36,NY


In [71]:
# This uses the boolean operator '&' to combine two conditions
my_dataframe[(my_dataframe["STATE"] == "NY") & (my_dataframe["bpop_20"] > 100000)]

Unnamed: 0,GEOID20,tot_pop_20,bpop_20,hpop_20,asian_nhpi_pop_20,amin_pop_20,other_pop_20,white_pop_20,STATEFP,STATE
1830,36005,1472654,580689,664646,74587,3956,17980,130796,36,NY
1842,36029,954236,151856,50907,52472,11424,9341,678236,36,NY
1851,36047,2736074,874009,432653,408299,7299,45387,968427,36,NY
1855,36055,759443,137530,60936,38570,7040,9214,506153,36,NY
1857,36059,1395774,175649,241413,176009,4102,19147,779454,36,NY
1858,36061,1694251,285271,338271,246939,5049,25427,793294,36,NY
1868,36081,2405464,471487,617149,691071,12803,63596,549358,36,NY
1879,36103,1525920,140589,315750,75414,7968,18869,967330,36,NY
1887,36119,1004457,166162,245884,75085,2851,16791,497684,36,NY
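
It can be easier to read these filters if each condition gets its own name. Each comparison produces a column of `True`/`False` values, and `&` ("and"), `|` ("or"), and `~` ("not") combine them row by row. The sketch below uses a small hand-made dataframe, and all variable names are made up for illustration. When conditions are written inline, as in the cell above, each one needs its own parentheses.

```python
import pandas as pd

small_dataframe = pd.DataFrame({
    "STATE": ["NY", "NY", "AL"],
    "bpop_20": [580689, 1147, 12266],
})

# Each comparison gives one True/False value per row
is_ny = small_dataframe["STATE"] == "NY"
is_large = small_dataframe["bpop_20"] > 100000

both = small_dataframe[is_ny & is_large]     # rows where both conditions hold
either = small_dataframe[is_ny | is_large]   # rows where at least one holds
not_ny = small_dataframe[~is_ny]             # rows where the condition does not hold

print(both)
print(either)
print(not_ny)
```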


In [72]:
my_new_dataframe.to_csv("my_new_dataframe.csv")
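
By default, `to_csv` also writes the row labels (the index) as an extra first column without a name. When such a file is read back in, pandas labels that column `Unnamed: 0`. If you do not need the row labels, you can pass `index=False`. The sketch below writes to a string in memory instead of a file so that it is easy to inspect, and all variable names are made up for illustration.

```python
import io

import pandas as pd

small_dataframe = pd.DataFrame({
    "STATE": ["NY", "AL"],
    "tot_pop_20": [1472654, 58805],
})

# Calling to_csv() without a file name returns the CSV text instead of saving it
with_index = small_dataframe.to_csv()
without_index = small_dataframe.to_csv(index=False)

# Reading the text back in shows the difference in the column names
read_back = pd.read_csv(io.StringIO(with_index))
read_back_clean = pd.read_csv(io.StringIO(without_index))

print(list(read_back.columns))
print(list(read_back_clean.columns))
```

To save to a file instead, put the file name first, for example `small_dataframe.to_csv("my_file.csv", index=False)`.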