# Python Tutorial 01:

## Basic Troubleshooting and Usability Tips


___

__Author:__ _Sherman6_  
__Updated:__ _2020, February_

___


### Key Concepts Covered:  


* __[Selecting Items and Objects](#first-bullet)__
    * Selecting items and objects (primarily in lists and dataframes) - troubleshooting and usability tips
    

* __[Selecting Items and Objects: Dataframes](#second-bullet)__
    * Selecting rows, columns, and cells in Pandas DataFrames, with troubleshooting  
        * __[Assigning values to a dataframe](#third-bullet)__  
        * __[Changing datatypes in a dataframe](#fourth-bullet)__  
        * __[Dataframes containing lists, and, lists containing dataframes](#fifth-bullet)__  


* __[Saving Items and Objects](#seventh-bullet)__
    * The difference between pointing and copying; tips for saving items in a physical location for future use


* __[Appendix](#appendix)__
    * References, More Resources, & Machine Information 
    
___ 

### Introduction:


Python's popularity continues to grow: according to Burtch Works, it became the #1 language preferred by those holding data science positions in 2019 [[1]](#reference).  This is not without good reason: Python is a great language for connecting disparate analytics tools to form a unified production pipeline, with benefits including ease of use, interpretability, an active user community, and a wide and growing array of libraries available for various purposes [[2]](#reference).  And, of course, _it's free!_  


___

### Purpose:

#### In each of these tutorials, I will demonstrate a few tips and tricks which will hopefully help others troubleshoot code in Python.  

___


One of Python's benefits is that there are often many different ways to achieve something.  However, there are often _even more ways_ one can get stuck trying to figure it out.  

In my experience, seeing examples of __"the wrong way"__ to code is just as educational as seeing __"the right way"__ to code, so in several places I show examples of things that DON'T work, the error messages resulting from them, and why those messages appear. 

Furthermore, I've spent my fair share of hours [Googling](https://www.google.com), [Stack Overflow-ing](https://stackoverflow.com/), and [GitHub-ing](https://github.com/) to find solutions, as well as a decent amount of time on some very informative and robust free tutorial websites, such as [W3Schools](https://www.w3schools.com/), [TutorialsPoint](https://www.tutorialspoint.com/python/index.htm), and [R-Bloggers](https://www.r-bloggers.com/) (the first that come to mind).  
- While there are some great docs and tutorials out there, I find myself __jumping from one to the next__, until I have __dozens of browser tabs open__, as each focuses _very in-depth_ on a specific command, library, function, or variable type.  


- That is why I'm designing these tutorials to be __broad, rather than deep__.  These tutorials are __not designed to be exhaustive__, but rather to help troubleshoot when you get stuck, and share a few helpful tips along the way. 



A few final comments:

- A small number of outside references, for additional explanation or information, are listed throughout. 

- More advanced topics, such as in-depth examples of DataFrames, and specific nuances for neural network training, will be covered in separate workbooks.  

- I intend to add to this workbook as time goes on, to address recurring topics or popular questions.

- This workbook assumes at least basic familiarity with Python 3 and Jupyter Notebooks.


___

In [1]:
#Importing packages:

import numpy as np
import pandas as pd
import pickle
import random

___
<a class="anchor" id="first-bullet"></a>

## Selecting Items and Objects  

I have a confession to make:  Even though I enjoy the Python language, I waste _way_ too much time on this topic, in practice.   That is why I've made it the first section in this tutorial.  

Figuring out how to properly select or identify the item you desire is a fundamental skill in Python.  It is sometimes referred to as 'extracting' the values you need from within objects.   Following are some examples of how to select various types of objects in Python, as well as how NOT to select those objects.  

___


How to create a list:  Use single brackets to surround items separated by commas. 

In [2]:
list1 = ['Mother', "Father", 'daughter']

Showing the list:

In [3]:
list1

['Mother', 'Father', 'daughter']

What data type is that object?  (It's a 'list').


In [4]:
type(list1)

list

How many items are in the list?

In [5]:
len(list1)

3

How to select item 1 (the first item) in the list:  
- Remember, in Python, item indexes start at 0, with the second element at 1, third element at 2, et cetera (elements do not start at 1, like in R). 

In [6]:
list1[0]

'Mother'

How to select item 2 in list:

In [7]:
list1[1]

'Father'

How to select all items in list:

In [8]:
list1[:]

['Mother', 'Father', 'daughter']

___

How to select the last item in the list (a slice such as `[-1:]` returns it inside a one-item list; `list1[-1]` returns the item itself):

In [9]:
list1[-1:]

['daughter']

How to select the last 2 items in the list:

In [10]:
list1[-2:]

['Father', 'daughter']
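One detail worth checking: a slice such as `list1[-1:]` returns a list, while a plain index such as `list1[-1]` returns the item itself. Here is a minimal sketch, re-creating `list1` so that it runs on its own (the names `last_item` and `last_slice` are only illustrative):

```python
list1 = ['Mother', 'Father', 'daughter']

last_item = list1[-1]     # plain index: returns the item itself (a string)
last_slice = list1[-1:]   # slice: returns a new list containing that item

print(type(last_item), last_item)
print(type(last_slice), last_slice)
```

Which one you want depends on what comes next: use the plain index to work with the value, and the slice to keep working with a list.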

What data type is the first item in the list?   
- (Each item is currently a string, because each was written inside quotes).


In [11]:
type(list1[0])

str

What data type is EACH item in the list?   
- (This is a simple 'for' loop, which cycles through each item and prints the data type). 

In [12]:
for i in range(len(list1)):
    print(type(list1[i]))

<class 'str'>
<class 'str'>
<class 'str'>
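The loop above uses index positions. A common alternative, sketched here, is to loop over the items directly, or to use `enumerate()` when the position is needed as well (the names `item_types`, `position`, and `item` are only illustrative):

```python
list1 = ['Mother', 'Father', 'daughter']

# Loop over the items directly and collect each item's data type.
item_types = [type(item) for item in list1]
print(item_types)

# enumerate() yields (position, item) pairs, with positions starting at 0.
for position, item in enumerate(list1):
    print(position, item, type(item))
```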


Let's say for some reason, the 'type' of some data is not what you want. 

In [13]:
list1 = ['2000', 1999, '4']
list1

['2000', 1999, '4']

In [14]:
for i in range(len(list1)):
    print(list1[i])
    print(type(list1[i]))
    print()

2000
<class 'str'>

1999
<class 'int'>

4
<class 'str'>



Certain items (the first and third) are strings, while others are integers (the second).   
This can create confusion when interpreting the data. 

___

__How to change data types (not an exhaustive list):__   


How to change a string into an integer:

In [15]:
list1[0] = int(list1[0])

In [16]:
type(list1[0])

int

How to change an integer into a string:

In [17]:
list1[1] = str(list1[1])

In [18]:
type(list1[1])

str

How to change into a floating point number (which can also hold fractional values, unlike an integer):

In [19]:
list1[1] = float(list1[1])

In [20]:
type(list1[1])

float

The results:

In [21]:
for i in range(len(list1)):
    print(list1[i])
    print(type(list1[i]))
    print()

2000
<class 'int'>

1999.0
<class 'float'>

4
<class 'str'>



Iterating through the list, to make them uniform data type:

In [22]:
for i in range(len(list1)):
    list1[i] = int(list1[i])

The results:

In [23]:
for i in range(len(list1)):
    print(list1[i])
    print(type(list1[i]))
    print()

2000
<class 'int'>

1999
<class 'int'>

4
<class 'int'>
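The loop above works because every item can be read as a whole number. If an item cannot be converted (for example, the text `'unknown'`), `int()` raises a `ValueError`. Below is one possible way to handle that. The helper `to_int_or_none` and the sample list `mixed` are hypothetical, and are not part of the original workbook:

```python
def to_int_or_none(value):
    """Return value as an int, or None if it cannot be converted (hypothetical helper)."""
    try:
        return int(value)
    except (ValueError, TypeError):
        return None

# Hypothetical list with one item that is not a number.
mixed = ['2000', 1999.0, '4', 'unknown']

converted = [to_int_or_none(value) for value in mixed]
print(converted)
```

Returning `None` is only one design choice; depending on the data, it may be better to let the error surface so that the bad value gets noticed.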



___
<a class="anchor" id="second-bullet"></a>

### Dataframes - Selecting Items and Objects

Pandas DataFrames have been ubiquitous in Python data analysis for several years now, and remain a popular framework for working with many types of data.   

For more information, I recommend going to the source, at [pandas.pydata.org](https://pandas.pydata.org/pandas-docs/stable/index.html#) [[3]](#reference).  

Additional resources are located in ['More Resources'](#reference) at the end of this workbook.

___

How to create a simple dataframe.
- (Note, in this dataframe I specified column names, but they aren't required):

In [24]:
df = pd.DataFrame({'Article_ID': ['A12','A13'],
                        'number': [4, 7],
                        'sentence': ["A shares 4 friends with B.", "C gives D 7 apples."],
                        'entities': [['A','B'],['C','D']]})

In [25]:
df

Unnamed: 0,Article_ID,number,sentence,entities
0,A12,4,A shares 4 friends with B.,"[A, B]"
1,A13,7,C gives D 7 apples.,"[C, D]"


What data type is that object?  (It's a Pandas DataFrame).


In [26]:
type(df)

pandas.core.frame.DataFrame

What are the object's dimensions?  (Since it's a Pandas DataFrame, it's essentially a table, therefore you can expect to see two dimensions).


In [27]:
df.shape

(2, 4)

This dataframe is 2 rows long by 4 columns wide.  

Notice that the index starts at 0, then goes to 1.  This is not a 'column', but is rather the 'index' for the rows, and can be reset as needed.

___

#### Selecting a column from the dataframe:

How to select the column __as a series__. 

In [28]:
df['number']

0    4
1    7
Name: number, dtype: int64

In [29]:
type(df['number'])

pandas.core.series.Series

How to select the column __as a dataframe__. 

In [30]:
df[['number']]

Unnamed: 0,number
0,4
1,7


In [31]:
type(df[['number']])

pandas.core.frame.DataFrame

Notice that the result is itself also __a dataframe__.  
Here we see how adding double brackets (that is, passing a __list__ of column names) allows it to stay in dataframe format, which can make things simpler.
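The same pattern extends to several columns at once, since the inner brackets are just a list of column names. This is a small sketch on a shortened copy of the example dataframe (the variable names are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'Article_ID': ['A12', 'A13'],
                   'number': [4, 7],
                   'sentence': ["A shares 4 friends with B.", "C gives D 7 apples."]})

one_column_series = df['number']            # a column name -> Series
one_column_frame = df[['number']]           # a list with one name -> DataFrame
two_columns = df[['Article_ID', 'number']]  # a list with two names -> DataFrame

print(type(one_column_series).__name__)
print(type(one_column_frame).__name__)
print(list(two_columns.columns))
```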

___  


__Selecting a row from the dataframe:  Simple Method__

Select just the column headers.

In [32]:
df[:0]

Unnamed: 0,Article_ID,number,sentence,entities


Note that in this "simple method", the brackets contain a __slice__, which follows normal Python slicing rules: `[:0]` means "every row before position 0", which is no rows at all, so only the column headers are returned.
- A single number without a colon (such as `df[0]`) is treated as a column name, not a row position. To select one row by its index, use the `.loc[]` method (shown below). 

Select the first row of the dataframe. 

In [33]:
df[:1]

Unnamed: 0,Article_ID,number,sentence,entities
0,A12,4,A shares 4 friends with B.,"[A, B]"


Select the first two rows of the dataframe.

In [34]:
df[:2]

Unnamed: 0,Article_ID,number,sentence,entities
0,A12,4,A shares 4 friends with B.,"[A, B]"
1,A13,7,C gives D 7 apples.,"[C, D]"


Select only row 2 of the dataframe. 

In [35]:
df[1:2]

Unnamed: 0,Article_ID,number,sentence,entities
1,A13,7,C gives D 7 apples.,"[C, D]"


The slice `1:2` starts at position 1 and stops before position 2, so row 2 (index = 1) is the only row returned here. 

How to select the last row of the dataframe. 

In [36]:
df[-1:]

Unnamed: 0,Article_ID,number,sentence,entities
1,A13,7,C gives D 7 apples.,"[C, D]"


How to select the last two rows of the dataframe. 

In [37]:
df[-2:]

Unnamed: 0,Article_ID,number,sentence,entities
0,A12,4,A shares 4 friends with B.,"[A, B]"
1,A13,7,C gives D 7 apples.,"[C, D]"
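The slices in this section can be checked side by side against a plain list. This is only a sketch with made-up data (`rows` and the `label` column are hypothetical); in both cases the number after the colon is the stop position, which is excluded:

```python
import pandas as pd

rows = ['row A', 'row B']
df = pd.DataFrame({'label': rows})

# Each line applies the same slice to the list and to the dataframe.
print(rows[:0], len(df[:0]))     # nothing before position 0
print(rows[:1], len(df[:1]))     # position 0 only
print(rows[1:2], len(df[1:2]))   # position 1 only
print(rows[-1:], len(df[-1:]))   # the last position only
```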


___

Note, simply typing `df[1]` will result in an error. It will not give you a row.

In [38]:
df[1]

KeyError: 1

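This `KeyError` appears because a single value inside plain brackets is looked up among the __column names__, and there is no column named `1`.  A dataframe has 2 dimensions, not 1, so a row must be requested with a slice (as above) or with `.loc[]` / `.iloc[]`. 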
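To see what is going on behind that `KeyError`, here is a short sketch on a reduced copy of the dataframe. The flag `lookup_failed` is only illustrative:

```python
import pandas as pd

df = pd.DataFrame({'Article_ID': ['A12', 'A13'], 'number': [4, 7]})

# A single value inside plain brackets is looked up among the column names.
try:
    df[1]
    lookup_failed = False
except KeyError:
    lookup_failed = True   # there is no column named 1
print(lookup_failed)

# To get the row at position 1, ask for a row explicitly.
second_row = df.iloc[1]
print(second_row['Article_ID'])
```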


___

__Select a row from the dataframe - another method:  `.loc[]`__

Another way to select SPECIFIC rows from a dataframe:  `.loc[]`

- I use this a lot; more than the simple `[i-1:i]` method (for row "i").  

How to select the first row (index = 0) of the dataframe. 

In [39]:
df.loc[0]

Article_ID                           A12
number                                 4
sentence      A shares 4 friends with B.
entities                          [A, B]
Name: 0, dtype: object

Notice the result is __as a series__. 

In [40]:
type(df.loc[0])

pandas.core.series.Series

How to select the first row (index = 0) of the dataframe __as a DataFrame__. 

In [41]:
df.loc[[0]]

Unnamed: 0,Article_ID,number,sentence,entities
0,A12,4,A shares 4 friends with B.,"[A, B]"


Notice the row was selected __as a DataFrame__.  
Adding double brackets (that is, passing a __list__ of index labels) allows it to stay in dataframe format, which can make things simpler.

In [42]:
type(df.loc[[0]])

pandas.core.frame.DataFrame

For reference, select the __second__ row (index = 1) of the dataframe, as a Dataframe. 

In [43]:
df.loc[[1]]

Unnamed: 0,Article_ID,number,sentence,entities
1,A13,7,C gives D 7 apples.,"[C, D]"


___
__Select a row from the dataframe - another method (less-common):  `.iloc[]`__  

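- This is less common, but is a __position-based__ lookup approach: `.iloc[]` takes integer positions (0, 1, 2, ...), whereas `.loc[]` takes index and column labels.  
- Note, in my experience `.loc[]` is more commonly used (and, because `.iloc[]` accepts only integer positions, it cannot specify an individual cell by column name as easily as `.loc[]` can).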


Select row 1 of the dataframe.

In [44]:
df.iloc[[0]]

Unnamed: 0,Article_ID,number,sentence,entities
0,A12,4,A shares 4 friends with B.,"[A, B]"


Select row 1 of the dataframe, this time using a slice.

In [45]:
df.iloc[0:1]

Unnamed: 0,Article_ID,number,sentence,entities
0,A12,4,A shares 4 friends with B.,"[A, B]"


Select rows 1-2 of the dataframe.

In [46]:
df.iloc[0:2]

Unnamed: 0,Article_ID,number,sentence,entities
0,A12,4,A shares 4 friends with B.,"[A, B]"
1,A13,7,C gives D 7 apples.,"[C, D]"


Neither method will error out if you slice past the last row; each simply returns the rows that exist:

In [47]:
df.loc[1:99]

Unnamed: 0,Article_ID,number,sentence,entities
1,A13,7,C gives D 7 apples.,"[C, D]"


In [48]:
df.iloc[1:99]

Unnamed: 0,Article_ID,number,sentence,entities
1,A13,7,C gives D 7 apples.,"[C, D]"


(Both methods work).
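With the default index (0, 1, 2, ...), labels and positions coincide, so `.loc[]` and `.iloc[]` can look interchangeable. A small sketch with a non-default index shows where they differ. The dataframe `scores` and its values are made up for illustration:

```python
import pandas as pd

# Hypothetical data with index labels 10, 20, 30 instead of the default 0, 1, 2.
scores = pd.DataFrame({'score': [0.5, 0.7, 0.9]}, index=[10, 20, 30])

by_label = scores.loc[10:20]     # labels 10 through 20; the end label is included
by_position = scores.iloc[0:1]   # position 0 only; the end position is excluded

print(by_label)
print(by_position)
```

Here `.loc[10:20]` returns two rows, while `.iloc[0:1]` returns one, which is worth keeping in mind after filtering or sorting a dataframe (when the index no longer runs 0, 1, 2, ...).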

___

### Selecting a cell from a dataframe:  

As a reminder, here is the dataframe we created earlier:

In [49]:
df = pd.DataFrame({'Article_ID': ['A12','A13'],
                        'number': [4, 7],
                        'sentence': ["A shares 4 friends with B.", "C gives D 7 apples."],
                        'entities': [['A','B'],['C','D']]})
df

Unnamed: 0,Article_ID,number,sentence,entities
0,A12,4,A shares 4 friends with B.,"[A, B]"
1,A13,7,C gives D 7 apples.,"[C, D]"


How to select the contents of the cell in column= 'number', row= 0 (the first row):

In [50]:
df['number'][0]

4

Another way:  __`.loc[]`__  

In [51]:
df.loc[0,'number']

4

Another way:  __`.at[]`__  

In [52]:
df.at[0,'number']

4

Note, use `.at` in certain situations where `.loc` doesn't work (for assigning values - see below). 

Also, you can combine both `.at` & `.loc` to select a cell:

In [53]:
df.loc[0].at['number']

4

___

As previously mentioned, `.iloc[]` cannot as easily specify an individual cell like `.loc[]` can.

Here, it throws an error if we try it in-place:


In [54]:
df.iloc[0,'number']

ValueError: Location based indexing can only have [integer, integer slice (START point is INCLUDED, END point is EXCLUDED), listlike of integers, boolean array] types

This `ValueError` appears because `.iloc[]` accepts only integer positions, and we gave it a column name.  

When using `iloc`, the code will be longer, but not much more complex.  If desiring to use `iloc`, we would have to write, `df.columns.get_loc('number')`, which obtains the column number.   `.iloc` only accepts the integer location of the column.  
If we knew the integer location of the column, we could just use it, as well:

In [56]:
df.columns.get_loc('number')

1

`'number'` is at column position 1 (the second column, counting from 0).

In [55]:
df.iloc[0,1]

4

This yields the same result, using `iloc[]`.
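The two steps can also be combined into a single expression. The sketch below additionally uses `.iat[]`, the position-based counterpart of `.at[]`; the variable names are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({'Article_ID': ['A12', 'A13'], 'number': [4, 7]})

column_position = df.columns.get_loc('number')   # integer position of the column
cell_by_iloc = df.iloc[0, column_position]       # row position 0, looked-up column position
cell_by_iat = df.iat[0, column_position]         # .iat is the single-cell, position-based accessor

print(column_position, cell_by_iloc, cell_by_iat)
```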

___
<a class="anchor" id="third-bullet"></a>

### Assigning values to a dataframe: When to use 'at', versus 'loc', versus [col][row]:  

Even though both `.at` and `.loc` can be used to _extract_ an item from a cell in a dataframe, there are certain situations where only `.at` can be used to __assign__ an item or items into that cell (when there is to be __more than one item__ in the cell).  

___

Generally, the `.loc` method works __best__ (if a cell is to hold a single item):


In [57]:
df.loc[0,'entities']

['A', 'B']

In [58]:
df.loc[0,'entities'] = "Jimmy"

In [59]:
df.loc[0,'entities']

'Jimmy'

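The [col][row] method (known as chained assignment) results in warnings, but still works in most situations in the pandas version used here.  The pandas documentation linked in the warning advises against it, and it may cease to be supported: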

In [60]:
df['entities'][0] = "Sam"

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """Entry point for launching an IPython kernel.


In [61]:
df.loc[0,'entities']

'Sam'

Note, the `dataframe.loc[row, "column"]` method is the one recommended by the pandas documentation, because it performs the selection and the assignment in a single step, although it can sometimes be more complicated to code if you're working within nested for-loops.  If you're trying to set values for an entire column, you may not see this warning message.  

In [None]:
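#Assuming 'df_list' is a list of dataframes (lists of dataframes are covered later in this workbook):
for i in range(len(df_list)): #for each dataframe in the list
    df_list[i]['entities_1'] = df_list[i]['entities'] #copies an entire column at once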

__Use `.at[]` if attempting to assign a list (of multiple items) into a cell__:

In [62]:
df.at[0,'entities']

'Sam'

In [63]:
df.at[0,'entities'] = ["New1", "New2"]

In [64]:
df['entities'][0] 

['New1', 'New2']

The above code works.   


Whereas, `.loc` doesn't work here; _this is the error message you'll get._

In [65]:
df.loc[0,'entities'] = ["New1", "New2"]

ValueError: Must have equal len keys and value when setting with an iterable

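The error message, `ValueError: Must have equal len keys and value when setting with an iterable`, appears because `.loc` tries to spread the items of the list across the cells being set, instead of placing the whole list into one cell.

So, `.loc` doesn't work if you want to assign a cell in a dataframe when there is to be more than one item in the cell. 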

Note, the [col][row] method also results in warnings, but still works (for now):

In [66]:
df['entities'][0] = ["Very New 1", "Very New 2"]

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """Entry point for launching an IPython kernel.


In [67]:
df['entities'][0] 

['Very New 1', 'Very New 2']

How to replace that cell with `None` (Python's placeholder for "no value"; it is not the same as zero or an empty string):

In [68]:
df.loc[0,'entities'] = None
#df['entities'][0] = None #also works.
#df.at[0,'entities'] = None #also works.

In [69]:
df.loc[0,'entities']

(No result displays, because there is nothing there).

In [70]:
df

Unnamed: 0,Article_ID,number,sentence,entities
0,A12,4,A shares 4 friends with B.,
1,A13,7,C gives D 7 apples.,"[C, D]"


In [71]:
df.at[0,'entities'] = ["A", "B"]

___
<a class="anchor" id="fourth-bullet"></a>

### How to change data types in DataFrames (not an exhaustive list): 

Note, what works for DataFrames, may not work for lists:

For example, to convert into a string:

In [72]:
df['number'] = df['number'].astype(str)
df[0:3]

Unnamed: 0,Article_ID,number,sentence,entities
0,A12,4,A shares 4 friends with B.,"[A, B]"
1,A13,7,C gives D 7 apples.,"[C, D]"


In [73]:
type(df['number'][0])

str

This `.astype()` command does NOT work on lists however:

In [74]:
list1[1] = list1[1].astype(str)

AttributeError: 'int' object has no attribute 'astype'

In [75]:
for i in range(len(list1)):
    list1[i].astype(int)

AttributeError: 'int' object has no attribute 'astype'

The above errors, `AttributeError: 'int' object has no attribute 'astype'`, occur because the items in the list are plain Python integers, which have no `astype()` method.  
Indeed, `astype()` is meant for casting pandas objects into specific dtypes.

To convert into an integer:

In [76]:
df['number'] = df['number'].astype(int)
type(df['number'][0])

numpy.int32

To convert into a floating point number (which can also hold fractional values, unlike an integer):

In [77]:
df['number'] = df['number'].astype(float)
print(type(df['number'][0]))
print(df['number'][0])

<class 'numpy.float64'>
4.0
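`.astype()` expects every value in the column to be convertible. When a column may contain entries that are not numbers, `pd.to_numeric()` with `errors='coerce'` is one option to consider. The series `raw` below is made-up data, used only for illustration:

```python
import pandas as pd

# Hypothetical column in which one entry is not a number.
# dtype=object keeps the entries as plain Python strings.
raw = pd.Series(['4', '7', 'n/a'], dtype=object)

# errors='coerce' turns anything that cannot be parsed into NaN (missing).
numbers = pd.to_numeric(raw, errors='coerce')

print(numbers)
print(numbers.isna().tolist())
```

The missing entries can then be counted, filled, or dropped, rather than stopping the conversion with an error.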


___
<a class="anchor" id="fifth-bullet"></a>

### DataFrames containing Lists:

In our Pandas DataFrame example, the following cell contains a list of items: 

In [78]:
df.loc[1,'entities'] #the cell is a list of 2 items. 

['C', 'D']

How to select an item within a list within a cell of a Pandas DataFrame: 

In [79]:
df.loc[1,'entities'][0] #Select just the first item.

'C'

In [80]:
df.loc[1,'entities'][1] #Select only the second item. 

'D'

How to select from the end of a list, working backwards from the bottom of the list:

In [81]:
df.loc[1,'entities'][-1] #Select the last item. 

'D'

In [82]:
df.loc[1,'entities'][-2] #Select the second-to-last item. 

'C'

How to reset an individual item within a list within a cell in a dataframe: Use `.loc` OR `.at`. 

In [83]:
df.loc[0,'entities'][0] = "Friend A"

In [84]:
df.loc[0,'entities'] #the cell is a list of 2 items. 

['Friend A', 'B']

There, we just changed the 1st item in the list. 

In [85]:
df.at[0,'entities'][1] = "Friend B"

In [86]:
df.loc[0,'entities'] #the cell is a list of 2 items. 

['Friend A', 'Friend B']

Now, we just changed the other item in the list. 

___

### Lists containing DataFrames:

You may encounter scenarios where you need to work with a list _consisting of dataframes_.  

That's right, you can create __a list of dataframes__.  For example, let's say for each input, a model outputs a whole dataframe of results.  Or, perhaps you want to work with different patients' testing results separately, rather than joining them in one massive table. 

Here's how:

In [87]:
#create a dataframe. 
df = pd.DataFrame({'Article_ID': ['A12','A13'],
                        'number': [4, 7],
                        'sentence': ["A shares 4 friends with B.", "C gives D 7 apples."],
                        'entities': [['A','B'],['C','D']]})
df

Unnamed: 0,Article_ID,number,sentence,entities
0,A12,4,A shares 4 friends with B.,"[A, B]"
1,A13,7,C gives D 7 apples.,"[C, D]"


In [88]:
#initialize a blank list.
df_list = []

In [89]:
#set the list to hold the dataframe as its first item (this replaces the blank list):
df_list = [df]

The list has 1 item so far:

In [90]:
len(df_list)

1

Viewing the list:

In [91]:
df_list

[  Article_ID  number                    sentence entities
 0        A12       4  A shares 4 friends with B.   [A, B]
 1        A13       7         C gives D 7 apples.   [C, D]]

Viewing the first item in the list:

In [92]:
df_list[0]

Unnamed: 0,Article_ID,number,sentence,entities
0,A12,4,A shares 4 friends with B.,"[A, B]"
1,A13,7,C gives D 7 apples.,"[C, D]"


Here we see that we've placed a pandas dataframe into a list, as the first item.

When that item is selected, it retains its object shape, style, formatting, and properties.

Now, let's add a new item.

In [93]:
#create a dataframe. 
next_item  = pd.DataFrame({'Article_ID': ['A14','A15'],
                        'number': [2, 3],
                        'sentence': ["E introduced 2 friends to A.", "F bought 3 plums from D."],
                        'entities': [['E','A'],['F','D']]})
next_item 

Unnamed: 0,Article_ID,number,sentence,entities
0,A14,2,E introduced 2 friends to A.,"[E, A]"
1,A15,3,F bought 3 plums from D.,"[F, D]"


How to append a dataframe item to a list:  it works just like any other item, using the `.append()` list command.

In [94]:
df_list.append(next_item)

Viewing the new list (now with 2 items):

In [95]:
df_list

[  Article_ID  number                    sentence entities
 0        A12       4  A shares 4 friends with B.   [A, B]
 1        A13       7         C gives D 7 apples.   [C, D],
   Article_ID  number                      sentence entities
 0        A14       2  E introduced 2 friends to A.   [E, A]
 1        A15       3      F bought 3 plums from D.   [F, D]]

Viewing/Selecting the next item in the list:

In [96]:
df_list[1]

Unnamed: 0,Article_ID,number,sentence,entities
0,A14,2,E introduced 2 friends to A.,"[E, A]"
1,A15,3,F bought 3 plums from D.,"[F, D]"


Here, we see that we've added a second item to the list.

The second item is also a dataframe.

Note, we could have also used the `.append()` command to input the very first item in the list. That would be better if we were using this within a loop itself. 
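As a rough sketch of that loop-based approach, the list can start out blank and receive every dataframe, including the first, through `.append()`. The `batches` data below is hypothetical, and the final `pd.concat()` step is optional (it is only needed if one combined table is wanted after all):

```python
import pandas as pd

df_list = []   # start with a blank list

# Hypothetical batches standing in for per-input results.
batches = [{'Article_ID': ['A12', 'A13'], 'number': [4, 7]},
           {'Article_ID': ['A14', 'A15'], 'number': [2, 3]}]

for batch in batches:
    df_list.append(pd.DataFrame(batch))   # append works for the first item too

# Optionally stack the separate dataframes into a single table.
combined = pd.concat(df_list, ignore_index=True)

print(len(df_list))
print(combined)
```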

___

How to select a specific __cell__ within a __dataframe__ within a __list__:

In [97]:
df_list[1]['sentence'][0]

'E introduced 2 friends to A.'

This returns the following cell:  
- from list, `df_list`, the second item (`[1]`), from column = `'sentence'`, from the first row (`[0]`). 

Another example:  How to select a specific __cell__ within a __dataframe__ within a __list__:

In [98]:
df_list[0]['sentence'][1]

'C gives D 7 apples.'

This returns the following cell:  
- from list, `df_list`, the first item (`[0]`), from column = `'sentence'`, from the second row (`[1]`). 

Another method, `.loc[]`:  How to select a specific __cell__ within a __dataframe__ within a __list__:

In [99]:
df_list[0].loc[1, 'sentence']

'C gives D 7 apples.'

This gives the same result as above, but using the `.loc[]` method, instead.
- Remember, `df_list[0]` is itself a dataframe.

#### Basic ranking:

How to select the row with the highest value for `number` column.

In [100]:
df_list[1].nlargest(1, 'number')

Unnamed: 0,Article_ID,number,sentence,entities
1,A15,3,F bought 3 plums from D.,"[F, D]"


How to select the row with the lowest value for `number` column.

In [101]:
df_list[1].nsmallest(1, 'number')

Unnamed: 0,Article_ID,number,sentence,entities
0,A14,2,E introduced 2 friends to A.,"[E, A]"
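`nlargest()` and `nsmallest()` return dataframes. If only the index label of the top row is needed, `idxmax()` on the column is an alternative, sketched here on a reduced copy of `next_item` (the names `top_label` and `top_row` are only illustrative):

```python
import pandas as pd

next_item = pd.DataFrame({'Article_ID': ['A14', 'A15'], 'number': [2, 3]})

top_label = next_item['number'].idxmax()   # index label of the largest value
top_row = next_item.loc[top_label]         # that row, returned as a Series

print(top_label)
print(top_row['Article_ID'])
```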


___
<a class="anchor" id="seventh-bullet"></a>

## Saving Items and Objects  

There are many ways to save objects in Python for future use.  For example, one can save CSVs, tables, models, images, PDFs, and .py files, to name a few.  However, some methods are easier than others.  

Here are a few methods for saving items _conveniently_.

___

#### Pointing vs. Copying:
Assigning an object to another name (does not save anything outside of your environment):  

- In Python, assigning one name to another (`B = A`) does not create a 'different' object; it simply adds another name for the original object.  Both A and B then point to the same object.  So, for example, if you did...
- A = 4
- B = A
- C = B
- D = C

All those names, A, B, C, & D, point to the same object (the value 4). 

Re-assigning one name (for example, `A = 5`) only moves that one name; the others still point to 4.  However, when the shared object is _mutable_ (such as a list or a dataframe) and you __change it in place__ through __one__ name, the change is visible through __all__ of its names.

In [102]:
list2 = ['Book 1', "Book 2", """Book X"""]
list2

['Book 1', 'Book 2', 'Book X']

In [103]:
list2new = list2
print(list2new)

['Book 1', 'Book 2', 'Book X']


In [104]:
list2[2] = "Change this item"
print(list2)

['Book 1', 'Book 2', 'Change this item']


In [105]:
print(list2new)

['Book 1', 'Book 2', 'Change this item']


As we can see, `list2new` points back to the original object, `list2`.  It's just another name for it.  So, when `list2` changed, so did `list2new`.  
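To check whether two names point to the same object, Python's `is` operator can be used. The sketch below also shows that pointing a name at a brand-new object (as opposed to changing the list in place) leaves the other name alone. The flag `same_object` is only illustrative:

```python
list2 = ['Book 1', 'Book 2', 'Book X']
list2new = list2              # a second name for the same list

same_object = list2new is list2
print(same_object)

# Re-assigning a name points it at a new object; the other name is unaffected.
list2 = ['A different list']
print(list2)
print(list2new)
```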

__How to resolve this?__  

Make a COPY of the original item, instead:  

In [106]:
list2 = ['Book 1', "Book 2", """Book X"""]
list2new = list2.copy()
print(list2new)

['Book 1', 'Book 2', 'Book X']


In [107]:
list2[2] = "Change this item"
print(list2)

['Book 1', 'Book 2', 'Change this item']


In [108]:
print(list2new)

['Book 1', 'Book 2', 'Book X']


As we can see, `list2new` remains untouched, as it was created disconnected from `list2`'s physical data location.  
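One caveat: `.copy()` on a list makes a _shallow_ copy, so any lists nested inside it are still shared. When the items are themselves lists, `copy.deepcopy()` from the standard library copies those as well. A minimal sketch, with a made-up nested list called `shelves`:

```python
import copy

shelves = [['Book 1', 'Book 2'], ['Book X']]   # hypothetical list of lists

shallow = shelves.copy()          # new outer list, but the same inner lists
deep = copy.deepcopy(shelves)     # new outer list and new inner lists

shelves[1][0] = "Change this item"

print(shallow[1])   # the shared inner list shows the change
print(deep[1])      # the deep copy is unaffected
```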

#### Deleting temporary objects:

You can delete objects to free up memory, and to release object names if you want to use a name again for another purpose.

In [109]:
del(list2, list2new) #deletes these objects.

In [110]:
list2

NameError: name 'list2' is not defined

The error message, `NameError: name 'list2' is not defined`, is because we have just now erased `list2`. It no longer exists in our environment.  

_ProTip: Be careful deleting items that are difficult to re-create!_

___

#### Directories

A directory name is not needed if the active Python code (such as this ipynb workbook) and the data are in the same folder on a machine or server. 

In [None]:
#Loading a file which is in same directory as code:
filename="File_Name_Goes_Here"
df = pd.read_csv(filename+".csv")

#Loading a file from a different directory than the code:
filename="File_Name_Goes_Here"
directory="/home/" #(not needed if ipynb and data are in the same location).
df = pd.read_csv(directory+filename+".csv")

    #Windows directory path example:('C:/Users/YourUserName/My Working Folder/')
    #Linux directory path example:('/home/YourAdminName/My Working Folder/')
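Joining a directory and a filename with `+` only works if the directory string ends with a slash. As an alternative sketch, the standard library's `pathlib` inserts the separator for you. `PurePosixPath` is used here only so the printed result looks the same on every operating system; in everyday code, `pathlib.Path` would be the usual choice, and the folder name is just the placeholder from the example above:

```python
from pathlib import PurePosixPath

filename = "File_Name_Goes_Here"
directory = PurePosixPath("/home/YourAdminName/My Working Folder")

# The / operator joins path pieces and inserts the separator where needed.
full_path = directory / (filename + ".csv")

print(full_path)
```

The resulting path object can be handed to `pd.read_csv()` in place of a string.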

___

#### Automatic Filenames


How to write a __base__ directory and filename once, and re-use those fundamental names again and again (saves time).

This is helpful if working with batch jobs: 

You only need to change the input filename, and all output filenames will automatically update.

In [None]:
filename="File_Name_Goes_Here"
df = pd.read_csv(filename+".csv")

Then, let's say we do some stuff to the input data, altering it, resulting in the output objects, 'df2', 'df2margins', and 'df2errors'.

You can use the following code to simply and easily save those outputs while at the same time referencing the initial input filename:

In [None]:
#Save the results:
df2.to_csv(filename+'.predicted.csv')
df2margins.to_csv(filename+'.marginals.csv')
df2errors.to_csv(filename+'.error_rates.csv')

The resulting files will be:

- "File_Name_Goes_Here.predicted.csv",
- "File_Name_Goes_Here.marginals.csv", and 
- "File_Name_Goes_Here.error_rates.csv".
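If the input name is stored with its extension (for example, when it comes from a file listing), the base name can be recovered before building the output names. This is only a sketch; the helper `output_name` is hypothetical and not part of the original workbook:

```python
from pathlib import PurePosixPath

def output_name(input_file, suffix):
    """Build '<input name without extension>.<suffix>.csv' (hypothetical helper)."""
    stem = PurePosixPath(input_file).stem   # drops the folder and the final extension
    return stem + "." + suffix + ".csv"

input_file = "File_Name_Goes_Here.csv"
output_files = [output_name(input_file, suffix)
                for suffix in ("predicted", "marginals", "error_rates")]

print(output_files)
```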

<a class="anchor" id="appendix"></a>
___

# Appendix:

___

<a class="anchor" id="reference"></a>

### References:

[1] https://www.burtchworks.com/2019/08/21/2019-sas-r-or-python-survey-update-which-tool-do-data-scientists-analytics-pros-prefer/

[2]  https://www.burtchworks.com/2019/12/10/metis-data-scientists-on-pythons-advantages-growing-popularity/

[3]  https://pandas.pydata.org/pandas-docs/stable/index.html#  


### More Resources:

- https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html
    - The official Pandas website, with 'how to's' on indexing and selecting data.  
- https://pandas.pydata.org/pandas-docs/stable/getting_started/comparison/comparison_with_r.html
    - The official Pandas website contains this great reference page displaying comparable code and libraries between Python's Pandas, and R / R libraries.  
- https://github.com/ageron/handson-ml
    - An incredibly well-structured and maintained educational resource for all things machine learning, and beyond.  
- https://pandas.pydata.org/pandas-docs/stable/user_guide/cookbook.html
    - A page of pandas 'idioms' – how to do 'if/else' statements in 1 line of code, and more.  




### Machine Information:  

This workbook was run on... 

In [111]:
#Python Version:
import sys
print(sys.version)

3.7.3 (default, Apr 24 2019, 15:29:51) [MSC v.1915 64 bit (AMD64)]


In [112]:
#Timestamp:
import datetime
datetime.datetime.now().strftime("%a, %d %B %Y %H:%M:%S")

'Wed, 12 February 2020 21:44:46'

In [113]:
#Operating System:
import os
print(os.name)
print(sys.platform)

nt
win32


___

