<h1>First Steps with Python</h1>
<h3>Overview</h3>
<p>
    <ul>
        <li>Overview of Python, Anaconda, and Jupyter Notebooks</li>
        <li>Data Values and Interpretations</li>
        <li>Variables, Functions, and Arguments</li>
        <li>Comparisons, Conditionals, and Loops</li>
        <li>Libraries</li>
    </ul>
</p>

<h3>Resources</h3>
<p>
    <ul>
        <li>O'Reilly Learning Platform: <a href = "https://databases.lib.wvu.edu/connect/1540334373" target ="_blank">https://databases.lib.wvu.edu/connect/1540334373</a></li>
    </ul>
</p>

<h2>Getting Started</h2>
<p>
    <ol>
        <li>Open Anaconda Navigator</li>
        <li>Open JupyterLab</li>
        <li>Using the file browser on the left, select the folder you will be working in</li>
        <li>Start a new notebook</li>
        <li>Select the Python kernel</li>
        <li>Click on File > <strong>Save Notebook As</strong> and give the file a name</li>
        <li>Click on File > <strong>Save Current Workspace As</strong> and save the workspace</li>
    </ol>
</p>

## Working Directory
The working directory is the folder Python is currently working in. Unless you give a full path, Python looks for the files you load in this folder and writes the files you save to it, so store your project files here.
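
To check which folder is the working directory, the standard library offers `os.getcwd()` and `pathlib.Path.cwd()`. The sketch below is illustrative only; the file name `data.csv` is a hypothetical placeholder, not a file provided with this lesson.

```python
import os
from pathlib import Path

# Show the folder Python is currently working in
print(os.getcwd())

# Build the full path to a file stored in the working directory.
# "data.csv" is a hypothetical file name used only for illustration.
data_path = Path.cwd() / "data.csv"
print(data_path)
```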

## Functions and Arguments

### Functions:
Like most computer software, Python allows you to run commands. Commands in Python are referred to as functions. When you install and import a new package, you gain access to more functions.

Think of functions as ready-made tools in a toolbox. Just as different tools measure, cut, or fix things, different functions perform different tasks such as filtering, sorting, or calculating.

In [None]:
import random

#random.sample() lets us randomly select a specified number of elements from a defined population.
random.sample(population=range(1,11),k=10)

### Arguments:
Arguments are the values and parameters that are acted on by the function. They are the information that you give to a function to tell it what to do.

In [None]:
# Arguments in a function have a fixed order. If you enter values in that order, you can skip the argument names.

#Randomly select 5 numbers from the range 1 to 10

random.sample(range(1,11), 5)
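
The two ways of passing arguments can be compared side by side. This is a small sketch rather than part of the original lesson: it calls `random.seed()` with an arbitrary value before each draw so that both calls start from the same random state, and the names `by_name` and `by_position` are made up for illustration.

```python
import random

# Keyword form: argument names are spelled out, so their order does not matter
random.seed(1)
by_name = random.sample(k=5, population=range(1, 11))

# Positional form: names are omitted, so values must follow the defined order
random.seed(1)
by_position = random.sample(range(1, 11), 5)

# Both calls start from the same seed, so they select the same numbers
print(by_name)
print(by_name == by_position)
```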

<h3>Commenting</h3>
<p>Since you will be performing several operations in a single document, and
even in a single code chunk, it is important to document what
each step does and to leave notes for yourself or
others about your intentions.</p>
<p>Entering a hash symbol (#) in your code turns everything after it on that line into a comment, which Python ignores.</p>
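
A short sketch of the hash symbol in use. The variable `price` and its values are hypothetical and chosen only for illustration.

```python
# A full-line comment: Python ignores everything after the hash symbol
price = 20  # an inline comment placed after a statement

# The next line is commented out, so it never runs
# price = 99

print(price)
```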

## Documentation
- Every package you use in Python has a documentation website or document. Explore it to see the available functions and their default arguments.
- You can also get help with a package or function from within Python by using help(name of function).

In [None]:
#Help function

help(random.sample)

In [None]:
help(random.choices)

In [None]:
#random.choices() draws a sample with replacement, meaning elements can be repeated in the result.
random.choices(range(1, 11), k=15)

## Objects
Objects allow you to store and work on data (numbers, words, tables, and more).

- numeric value:                     numValue = 400
- character (string):                chrValue = "Hello World"
- results of running function(s):    resultFunction = function(xxx)
- list:                              listValue = [1,2,3]
- data frame:                        dfValue = pd.read_csv("data.csv")

## Assignment Operator
The assignment operator (=) allows you to create an object.

In [None]:
a=35
b=45

print(a)

print(a+b)

sample_result = random.sample(range(1,11), 5)
print(sample_result)

## Naming Conventions
- Use descriptive and meaningful names that indicate the purpose of the object
- Use lowercase letters.
- Use underscores to separate words (e.g., my_variable_name).
- Avoid using reserved words (e.g., "if", "else", "for", "def") or the names of built-in functions (e.g., "print", "list", "type").
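
One way to check whether a name is reserved is the standard-library `keyword` module. This is a minimal sketch; the name `my_variable_name` is reused from the example in the list above.

```python
import keyword

# The full list of reserved words in the running version of Python
print(keyword.kwlist)

# keyword.iskeyword() reports whether a name is reserved
print(keyword.iskeyword("for"))               # True
print(keyword.iskeyword("my_variable_name"))  # False
```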

<h2>Data Types in Python</h2>
<ul>
    <li>Integer or Float - used for numbers, which can be integers (whole numbers) or floats (numbers with decimal points).</li>
    <li>String - used for text, words, and strings of characters. Enclose in double ("") or single ('') quotes, e.g. "26501".</li>
    <li>Category - used to represent categorical data with predefined levels (provided by the pandas library).</li>
    <li>Date - used for handling dates, times, and time intervals (provided by the datetime module and pandas).</li>
    <li>Boolean - used for decision-making and represented by the values True or False.</li>
</ul>

In [None]:
type(4) #integer

In [None]:
type(4.5673) #float

In [None]:
#Type functions

type("26501") #string

In [None]:
#Strings - anything entered in "" will be interpreted as a string

hello = "Hello World"
type(hello)

In [None]:
type(True) #boolean
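
The date and category types are not built in the way numbers, strings, and booleans are. The sketch below assumes the standard-library `datetime` module for dates and `pd.Categorical` from pandas for categories; the date and the level names are made-up illustrative values.

```python
from datetime import date
import pandas as pd

# Dates: the standard-library datetime module provides a date type
some_date = date(2024, 1, 15)
print(type(some_date))

# Categories: pandas stores categorical data with predefined levels
sizes = pd.Categorical(
    ["small", "large", "small"],
    categories=["small", "medium", "large"],
)
print(sizes.categories)
```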

## Entering Data

### List

ordered collections of data items

listSyntax = [object1, object2, object3]

In [None]:
#Lists

num_list = [6, 1, 2, 5, 7, 9]
cities = ['Morgantown', 'Charleston', 'Pittsburgh', 'New York City']

In [None]:
#Indexing
#indexing returns the item stored at the specified position
#syntax []

#numbered order starts at 0
print(num_list[0]) 

print(num_list[0]+10) 

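print(cities[0])

#range (slice): items at positions 0 and 1; the end position is not included
print(cities[0:2])


#force an error message: cities has 4 items (indexes 0-3), so cities[4] raises an IndexError
print(cities[4])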


In [None]:
cities.append('Fairmont')

print(cities)

In [None]:
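#after append(), 'Fairmont' is stored at index 4
print(cities[4])

#the list now has 5 items (indexes 0-4), so cities[5] raises an IndexError
print(cities[5])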

In [None]:
print(num_list)

num_list.extend([8,4])

print(num_list)

num_list.insert(1,10)

print(num_list)

In [None]:
mix_list = [1,2,3, 'hello', 'world']
mix_list[0]+1
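
The out-of-range errors in this section follow from zero-based numbering: the last valid index is always one less than the number of items. The sketch below uses a separate list named `cities_demo` (a hypothetical name, so it does not overwrite `cities`) and shows one way to catch the error instead of stopping the cell.

```python
cities_demo = ['Morgantown', 'Charleston', 'Pittsburgh', 'New York City']

# len() gives the number of items; the last valid index is len() - 1
print(len(cities_demo))
print(cities_demo[len(cities_demo) - 1])

# Negative indexes count backward from the end of the list
print(cities_demo[-1])

# An out-of-range index raises IndexError, which try/except can catch
try:
    print(cities_demo[4])
except IndexError as err:
    print("IndexError:", err)
```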

### Array

ordered collections of data items of the same type

provided by *numpy* library

arraySyntax = np.array([object1, object2, object3])

In [None]:
import numpy as np
num_arr = np.array([1,2,3,4])

print(num_arr[0])
print(num_arr[0]+10)

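#an array holds a single data type, so mixing numbers and text turns every element into a string
mix_arr = np.array([1,2,3,'hello','world'])
#mix_arr[0] is the string '1', so adding 1 raises an error
print(mix_arr[0]+1)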


In [None]:
#np.append() returns a new array; it does not change num_arr in place
print(np.append(num_arr, 10))

#num_arr is unchanged
print(num_arr)

In [None]:
num_arr2 = np.array([6, 8,10,12])

In [None]:
np.concatenate((num_arr, num_arr2))

In [None]:
np.insert(num_arr, [1,3], [77,88])
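
Two behaviors set arrays apart from lists: arithmetic applies to every element at once, and functions such as `np.append()` return a new array rather than changing the original. The sketch below illustrates both; the name `longer_arr` is hypothetical.

```python
import numpy as np

num_arr = np.array([1, 2, 3, 4])

# Arithmetic applies to every element at once
print(num_arr + 10)

# Mixing numbers and text converts every element to a string type
mix_arr = np.array([1, 2, 3, 'hello', 'world'])
print(mix_arr.dtype)

# np.append() returns a new array, so assign the result to keep it
longer_arr = np.append(num_arr, 10)
print(longer_arr)
print(num_arr)  # the original array is unchanged
```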

### Data Frame

A two-dimensional table of data variables (columns) and observations (rows). While each variable in a data frame typically contains data of the same type, different variables can contain different data types.

dataFrameSyntax = pd.DataFrame({"name1": column1, "name2": column2, "name3": column3})

In [None]:
title = ["Star Wars", "The Empire Strikes Back", "Return of the Jedi"]
year = [1977, 1980, 1983]
length_min = [121, 124, 133]
box_office_mil = [787, 534, 572]

In [None]:
import pandas as pd

sw_df = pd.DataFrame({
    "Title": title,
    "Year": year,
    "Length_min": length_min,
    "Box_Office_mil": box_office_mil
})

sw_df

In [None]:
print(sw_df['Year'].values)
type(sw_df['Year'].values)
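
Beyond pulling out a column's values, a few other common ways to look inside a data frame are sketched below. The example builds a smaller frame named `demo_df` (a hypothetical name, so `sw_df` is left untouched) from two of the columns above.

```python
import pandas as pd

demo_df = pd.DataFrame({
    "Title": ["Star Wars", "The Empire Strikes Back", "Return of the Jedi"],
    "Year": [1977, 1980, 1983],
})

# Number of rows and columns
print(demo_df.shape)

# One column, returned as a pandas Series
print(demo_df["Title"])

# The first row, selected by position
print(demo_df.iloc[0])

# Only the rows that meet a condition
after_1978 = demo_df[demo_df["Year"] > 1978]
print(after_1978)
```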

### Export the Data Frame
Once you are done entering your data, you can export it to your working directory with the to_csv() method.


object.to_csv("name of file.extension")

In [None]:
sw_df.to_csv('starwars.csv')
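
By default, `to_csv()` also writes the row index as an unnamed first column. The sketch below shows one way to leave it out with `index=False`. To avoid writing a file, it calls `to_csv()` without a file name, which returns the CSV text, and reads that text back through `io.StringIO`. The frame `demo_df` is a hypothetical one-row example.

```python
import io
import pandas as pd

demo_df = pd.DataFrame({"Title": ["Star Wars"], "Year": [1977]})

# With no file name, to_csv() returns the CSV text instead of writing a file.
# index=False leaves out the row index column.
csv_text = demo_df.to_csv(index=False)
print(csv_text)

# Read the CSV text back into a new data frame
round_trip = pd.read_csv(io.StringIO(csv_text))
print(round_trip)
```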

### Individual descriptive statistics

In [None]:
# the mean box office revenue
print(sw_df['Box_Office_mil'].mean())

# the total box office revenue
print(sw_df['Box_Office_mil'].sum())

#what is the median box office revenue
print(sw_df['Box_Office_mil'].median())

# what is the standard deviation of the box office revenue
print(sw_df['Box_Office_mil'].std())
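
As a cross-check, the same statistics can be computed on a standalone Series holding the three box office values entered above. The name `box_office` is hypothetical. `describe()` is shown as one way to get several statistics in a single call.

```python
import pandas as pd

box_office = pd.Series([787, 534, 572])

print(box_office.sum())     # total: 1893
print(box_office.mean())    # mean: 631.0
print(box_office.median())  # median: 572.0

# describe() reports count, mean, std, min, quartiles, and max together
print(box_office.describe())
```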

In [None]:
title = ["Star Wars", "The Empire Strikes Back", "Return of the Jedi", 'movie_name']
year = [1977, 1980, 1983, 2000]
length_min = [121, 124, 133, 115]
box_office_mil = [787, 534, 572,'test']

In [None]:
import pandas as pd

sw_df = pd.DataFrame({
    "Title": title,
    "Year": year,
    "Length_min": length_min,
    "Box_Office_mil": box_office_mil
})

sw_df

In [None]:
sw_df.dtypes
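
A single text value such as 'test' makes pandas store the whole column with the general `object` type instead of a numeric type. One possible way to recover the numbers, sketched below under that assumption, is `pd.to_numeric()` with `errors="coerce"`, which replaces values that cannot be converted with NaN (missing). The names `mixed` and `numeric` are hypothetical.

```python
import pandas as pd

# The same values as the Box_Office_mil column above, including the text entry
mixed = pd.Series([787, 534, 572, "test"])
print(mixed.dtype)

# Convert to numbers; entries that cannot be converted become NaN
numeric = pd.to_numeric(mixed, errors="coerce")
print(numeric)

# Statistics skip NaN by default, so the mean uses the three valid numbers
print(numeric.mean())
```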