## Getting Started with Jupyter Notebooks and Python for Business Data Analytics
### Table of Contents
1. [Introduction](#introduction)
2. [Installing and importing modules](#modules)
3. [Verify datatypes and expected output](#verify)
4. [How to troubleshoot code and get help](#help)
5. [Functions and Try-Except Structures](#functions)

## Introduction <a name="introduction"></a>
This notebook will guide you through the basics of using Jupyter notebooks and provide a quick review of common Python concepts.


In [2]:
# Use the keyboard as much as possible
#
# To execute a cell, use Ctrl-Enter (runs the cell and keeps it selected)
# or Shift-Enter (runs the cell and moves to the next one) on Windows.
# On macOS, Command-Enter may also work. The keyboard is the fastest way.
# Avoid the Run button in the toolbar and Run Cells in the Cell menu.
lst = [1,2,'Three',{'Four':4},('Five','Six',7)]
#
# Viewing output:
# If you want to see the result of your cell, either print() the thing you want to see:
print('Using the print command:',lst)
# or put the variable in the last line of the cell.
lst

Using the print command: [1, 2, 'Three', {'Four': 4}, ('Five', 'Six', 7)]


[1, 2, 'Three', {'Four': 4}, ('Five', 'Six', 7)]

In [3]:
# Use the command mode keystrokes:
# 'Esc' puts the cell into command mode (turns blue)
# 'Enter' puts the cell into edit mode (turns green)
# Try this sequence:
# - hit the Esc key
# - then hit the 'b' key
# This inserts a cell below. Very fast and useful.

In [4]:
# Test

In [5]:
# Test

In [6]:
# Insert above
# Use the keystroke: Esc - a
# Inserts a cell above.

In [7]:
# Delete a cell
# Esc dd

In [8]:
# Undo last action
# Esc z

In [9]:
# Change cell type:
# Esc m (markdown)
# Esc y (code)
#
# Then hit enter to edit the cell.

In [10]:
# Comment and uncomment multiple lines:
# Highlight the lines and push: Ctrl - /

# This is a nested for loop:
for i in range(10):
    # For every i = 0,1,2,...9, do this    
    for j in range(2):
        # Print current i & 0, then i & 1
        print(i,j)

0 0
0 1
1 0
1 1
2 0
2 1
3 0
3 1
4 0
4 1
5 0
5 1
6 0
6 1
7 0
7 1
8 0
8 1
9 0
9 1


In [11]:
# Restart the kernel: Esc 0 0 (this is zero zero)
# This is a fresh beginning for the notebook: all variables are cleared from memory.
# lst will be undefined (NameError) until you rerun the cell that defines it.
lst

[1, 2, 'Three', {'Four': 4}, ('Five', 'Six', 7)]
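
After a restart it can be handy to check whether a name exists before using it. The sketch below is one way to do that; `is_defined` is a hypothetical helper name, not something Jupyter provides.

```python
def is_defined(name: str) -> bool:
    """Return True if a notebook-level (global) variable with this name exists."""
    return name in globals()

# 'lst' only exists if the cell that creates it has run since the last restart.
if is_defined("lst"):
    print("lst is available")
else:
    print("lst is not defined; rerun the cell that creates it")
```

Simply evaluating `lst` and reading the `NameError` tells you the same thing; the helper just avoids the traceback.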

## Installing and Importing Modules <a name="modules"></a>
Most Python packages we will use are already installed, so we can just import them. Occasionally, we have to install a package before we can use it.

In [12]:
# Checking Python kernel version
from platform import python_version
python_version()

'3.12.5'

In [15]:
# If a module is installed, then we can just import it
import pandas as pd
# No error, so we are OK.

In [16]:
# Try to import something that isn't installed 
import pdftextract
# Error: This module is likely not installed.

ModuleNotFoundError: No module named 'pdftextract'

In [17]:
# Let's install it
%pip install pdftextract
# Now we can import it
import pdftextract # no error
# Get version information about pdftextract
%pip show pdftextract

Collecting pdftextract
  Downloading pdftextract-0.0.5-py3-none-any.whl.metadata (3.5 kB)
Downloading pdftextract-0.0.5-py3-none-any.whl (1.4 MB)
Installing collected packages: pdftextract
Successfully installed pdftextract-0.0.5
Note: you may need to restart the kernel to use updated packages.
Name: pdftextract
Version: 0.0.5
Summary: a very fast and efficient text and image pdf extractor.
Home-page: https://github.com/Bnilss/pdftextract
Author: Iliass Benali
Author-email: iliassben97@gmail.com
License: UNKNOWN
Location: c:\Users\isabe\OneDrive\Desktop\GSB-570\.venv\Lib\site-packages
Requires: 
Required-by: 
Note: you may need to restart the kernel to use updated packages.
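
Instead of waiting for a `ModuleNotFoundError`, you can ask Python whether a module can be found before importing it. This is a minimal sketch using the standard library's `importlib.util.find_spec`; the helper name `is_installed` and the module names in the loop are only illustrative.

```python
import importlib.util

def is_installed(module_name: str) -> bool:
    """Return True if Python can locate a top-level module without importing it."""
    return importlib.util.find_spec(module_name) is not None

# The module names here are just examples.
for name in ["json", "pdftextract"]:
    status = "installed" if is_installed(name) else "missing (try %pip install)"
    print(f"{name}: {status}")
```

The check only says whether the module can be found in the current kernel's environment, so a package installed into a different environment will still show as missing.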


## Verify datatypes and expected outputs <a name="verify"></a>
Take the time to verify that the output of each cell is exactly what you expect.  
Do not move on until you have convinced yourself that the output is correct.

In [18]:
# Many problems arise from not understanding datatypes.
# Make sure you know the type of every variable.
# Checking types often leads you to what is broken.
# I use the type() function a lot.
df = pd.DataFrame(['1',2.0,-3],columns=['col_0'],index=['row1', 'row2', 'row3'])
print(df)
type(df)

     col_0
row1     1
row2   2.0
row3    -3


pandas.core.frame.DataFrame

In [19]:
# Append this DataFrame to the lst from above
lst.append(df)
# Now look at the type of each item
for item in lst:
    print('Print the item:',item,', Type of item:',type(item))

Print the item: 1 , Type of item: <class 'int'>
Print the item: 2 , Type of item: <class 'int'>
Print the item: Three , Type of item: <class 'str'>
Print the item: {'Four': 4} , Type of item: <class 'dict'>
Print the item: ('Five', 'Six', 7) , Type of item: <class 'tuple'>
Print the item:      col_0
row1     1
row2   2.0
row3    -3 , Type of item: <class 'pandas.core.frame.DataFrame'>
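
`type(df)` only tells you that the object is a DataFrame. The values inside `col_0` are a string, a float, and an integer, which is easy to miss in the printed table. The sketch below shows one way to inspect and repair this; the column name `col_0_num` is an assumption added for illustration.

```python
import pandas as pd

df = pd.DataFrame(['1', 2.0, -3], columns=['col_0'], index=['row1', 'row2', 'row3'])

# The column dtype is 'object' because the values have mixed Python types.
print(df.dtypes)

# Inspect the Python type of each individual value.
print(df['col_0'].map(type))

# Convert to numbers; errors='coerce' turns unparseable values into NaN.
df['col_0_num'] = pd.to_numeric(df['col_0'], errors='coerce')
print(df.dtypes)
```

An `object` dtype on a column you expect to be numeric is usually a sign that at least one value was read in as text.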


## Functions and Try-Except structures <a name="functions"></a>
Sometimes it really helps to organize your code into functions.<BR>
Additionally, the try-except structure lets your code handle runtime errors or glitches in the data.

In [20]:
# This cell defines the function. Once you execute this cell,
# the function stays in memory, ready to use.
# Define a function that raises the first number to the power of the second
def power(x,y):
    # Check out the try-except structure
    try:
        # Test whether the first argument is an integer
        if not isinstance(x,int):
            return('Please give me integers')
        else:
            # Otherwise, do the math and return it
            return(x**y)
    # This is where an error raised in the try block is caught
    except Exception as e:
        # Return a tuple with a message and the error (nothing is printed here)
        return('Something went wrong:', e)

In [21]:
# Test it: Notice the difference between 2 and '2'
print('Works great:', power(2,2)) # works great
print('Catches first variable error:', power('2', 2)) # caught by the isinstance check
print(power(2,'2')) # The 2nd argument is not type-checked; except catches the TypeError, so the cell still runs with no errors

Works great: 4
Catches first variable error: Please give me integers
('Something went wrong:', TypeError("unsupported operand type(s) for ** or pow(): 'int' and 'str'"))
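
The last call shows that `power` only validates its first argument and relies on `except` for the second. A possible variant, sketched here under the hypothetical name `safe_power`, checks both arguments up front and catches only the one error that integer inputs can still raise.

```python
def safe_power(x, y):
    """Raise x to the power y, validating both arguments first."""
    for value in (x, y):
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            return 'Please give me integers'
    try:
        return x ** y
    except ZeroDivisionError as e:
        # Zero raised to a negative power is the integer case that still fails.
        return f'Something went wrong: {e}'

print(safe_power(2, 2))
print(safe_power(2, '2'))
print(safe_power(0, -1))
```

Catching a specific exception instead of `Exception` keeps unexpected bugs visible rather than hiding them behind a generic message.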


### Variable scope in a Jupyter Notebook

Variables declared in a cell outside a function are global variables in your notebook. A for loop does not create a new scope, so the loop below overwrites both i and j.

In [22]:
i=10
j=20

for i in range(10):
    print(i, j)
    j=5
print(j)

0 20
1 5
2 5
3 5
4 5
5 5
6 5
7 5
8 5
9 5
5


In [23]:
print(i,j)

9 5


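Variables assigned inside a function are local to that function. Names that are only read, such as i below, are looked up in the notebook's global scope.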

In [24]:
def addtoi(x: int):
    x=x+1
    return x+i

In [25]:
x = addtoi(5)
x

15

### End:
Hopefully, this was a good review of Python and a source of useful tips on using Jupyter notebooks.