# 1. Environment Setup

## Development Environment in CSIS 3290


- Anaconda (used as the package manager)

- Jupyter notebook 

Some of you may prefer a more fully featured integrated development environment (IDE):

- [PyCharm](https://www.jetbrains.com/pycharm/download/?section=mac)

- Or [VSCode](https://code.visualstudio.com)

- Or [Spyder](https://www.spyder-ide.org)


## DataCamp


[DataCamp.com](https://datacamp.com) is an online learning platform specializing in data science and analytics skills. It offers interactive courses and tutorials on a variety of topics, including programming (Python, R, SQL), data visualization, machine learning, and more. The interactive courses are particularly beneficial for students new to computer programming, easing their learning curve. 

Students enrolled in CSIS 3290 will have free access to DataCamp throughout the semester. Our assignments will also be hosted on DataCamp.com.


## Other Learning Resources for Python

- [Python for Data Analysis 3rd Edition by Wes McKinney](https://wesmckinney.com/book/) 

  GitHub: https://github.com/wesm/pydata-book

- [Python Data Science Handbook by Jake VanderPlas](https://jakevdp.github.io/PythonDataScienceHandbook/)

  GitHub: https://github.com/jakevdp/PythonDataScienceHandbook
  
  
Both books are available on the [O'Reilly Learning Platform](https://www.oreilly.com/). Make sure you log in with your school email.

Additionally, remember that the official documentation for Python packages, online resources, and AI assistants like ChatGPT can serve as valuable tutors while learning Python.



## How to succeed in this course

1. Carefully review the course outline to grasp its structure thoroughly.
2. Recognize the significance of academic integrity.
3. Approach the text and code in this course actively. Engage with them by following along and experimenting. Building muscle memory through hands-on practice will prove more valuable than merely reading about it or listening to the instructor rambling in class. &#x1F600;
4. If you lack technical expertise and struggle with basic computer proficiency, it's essential to take proactive steps to improve as soon as possible. Start by learning basic tasks such as file compression, decompression, and file saving. These fundamental skills are essential prerequisites for this course.
5. Do some research before posing technical inquiries. I encourage you to harness the power of AI and the vast resources available on the Internet to deepen your understanding of concepts. Being proficient in data science isn't just about memorizing every tool or command; it's about mastering the skill of efficiently finding the information you're unfamiliar with. In my view, honing this ability is even more critical than the course material itself.

## IPython and Jupyter

The [IPython project](https://ipython.org/) began in 2001 as Fernando Pérez’s side project to make a better interactive Python interpreter. Over the subsequent 20 years it has become one of the most important tools in the modern Python data stack. While it does not provide any computational or data analytical tools by itself, IPython is designed for both interactive computing and software development work. It encourages an execute-explore workflow instead of the typical edit-compile-run workflow of many other programming languages. It also provides integrated access to your operating system’s shell and filesystem; this reduces the need to switch between a terminal window and a Python session in many cases. Since much of data analysis coding involves exploration, trial and error, and iteration, IPython can help you get the job done faster.

In 2014, Fernando and the IPython team announced the [Jupyter project](https://jupyter.org/), a broader initiative to design language-agnostic interactive computing tools. The IPython web notebook became the Jupyter notebook, with support now for over 40 programming languages. The IPython system can now be used as a kernel (a programming language mode) for using Python with Jupyter.

IPython itself has become a component of the much broader Jupyter open source project, which provides a productive environment for interactive and exploratory computing. Its oldest and simplest "mode" is as an enhanced Python shell designed to accelerate the writing, testing, and debugging of Python code. You can also use the IPython system through the Jupyter notebook.

The Jupyter notebook system also allows you to author content in Markdown and HTML, providing you a means to create rich documents with code and text.

## How to launch the IPython Shell


Launch the IPython interpreter by typing **`ipython`** on the command line; alternatively, if you've installed a distribution like Anaconda, there may be a launcher specific to your system. Once you do this, you should see a prompt like the following:

```ipython
Python 3.9.2 (v3.9.2:1a79785e3e, Feb 19 2024, 01:06:15) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.21.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]:
```


## How to launch the Jupyter Notebook or JupyterLab

The Jupyter Notebook is a browser-based graphical interface to the IPython shell, and builds on it a rich set of dynamic display capabilities.
As well as executing Python/IPython statements, notebooks allow the user to include formatted text, static and dynamic visualizations, mathematical equations, JavaScript widgets, and much more.
Furthermore, these documents can be saved in a way that lets other people open them and execute the code on their own systems.

Though you'll view and edit Jupyter notebooks through your web browser window, they must connect to a running Python process (known as a "kernel") in order to execute code.
You can start the Jupyter server, which launches and manages these kernels for you, by running the following command in your system shell:

```
$ jupyter notebook
```

This command will launch a local web server that will be visible to your browser.


Upon issuing the command, your default browser should automatically open and navigate to the local URL printed in the terminal;
the exact address will depend on your system. If the browser does not open automatically, you can open a window and enter this address manually (typically *http://localhost:8888/*).

If you want a more feature-rich interface, you can try JupyterLab instead of Jupyter Notebook. Jupyter Notebook has a simpler, more lightweight interface, while JupyterLab is more versatile and flexible.

```
$ jupyter lab
```

## How to quickly access the documentation in IPython or Jupyter

The Python language and its data science ecosystem are built with the user in mind, and one big part of that is access to documentation.
Every Python object contains a reference to a string, known as a *docstring*, which in most cases will contain a concise summary of the object and how to use it.
Python has a built-in `help` function that can access this information and print the results.
For example, to see the documentation of the built-in `len` function, you can do the following:

```ipython
In [1]: help(len)
Help on built-in function len in module builtins:

len(obj, /)
    Return the number of items in a container.
```

Depending on your interpreter, this information may be displayed as inline text or in a separate pop-up window.
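To see where this text comes from, here is a minimal sketch. The function `rectangle_area` is a hypothetical example, not part of the course materials. The string literal placed directly under the `def` line becomes the function's docstring, which is what `help` displays.

```python
import inspect


def rectangle_area(width, height):
    """Return the area of a rectangle with the given width and height."""
    return width * height


# In an interactive session, help(rectangle_area) shows the signature
# followed by the docstring written above.

# The same text is stored on the object itself.
print(rectangle_area.__doc__)

# inspect.getdoc works for built-ins too and cleans up indentation.
print(inspect.getdoc(len))
```

Writing a one-line docstring for your own functions means `help` works on them just as it does on built-ins.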

## Magic Commands in IPython or Jupyter

IPython provides special commands known as *magic commands*, which are prefixed by the `%` character. These magic commands are designed to succinctly solve various common problems in standard data analysis. Magic commands come in two flavors: *line magics*, which are denoted by a single `%` prefix and operate on a single line of input, and *cell magics*, which are denoted by a double `%%` prefix and operate on multiple lines of input.

#### Running External Code: %run
As you begin developing more extensive code, you will likely find yourself working in IPython for interactive exploration, as well as a text editor to store code that you want to reuse.
Rather than running this code in a new window, it can be convenient to run it within your IPython session.
This can be done with the `%run` magic command.

For example, imagine you've created a *myscript.py* file with the following contents:

```python
# file: myscript.py

def square(x):
    """square a number"""
    return x ** 2

for N in range(1, 4):
    print(f"{N} squared is {square(N)}")
```

You can execute this from your IPython session as follows:

```ipython
In [6]: %run myscript.py
1 squared is 1
2 squared is 4
3 squared is 9
```

Note also that after you've run this script, any functions defined within it are available for use in your IPython session:

```ipython
In [7]: square(5)
Out[7]: 25
```
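`%run` only works inside IPython or Jupyter. As a rough plain-Python analogue (an assumption for illustration, not a description of how IPython implements `%run`), the standard library's `runpy.run_path` executes a script file and returns the names it defined. The sketch below recreates *myscript.py* in a temporary directory so that it is self-contained.

```python
import runpy
import tempfile
from pathlib import Path

# Recreate the example script so this sketch does not depend on an existing file.
script_source = '''
def square(x):
    """square a number"""
    return x ** 2

for N in range(1, 4):
    print(f"{N} squared is {square(N)}")
'''

with tempfile.TemporaryDirectory() as tmp_dir:
    script_path = Path(tmp_dir) / "myscript.py"
    script_path.write_text(script_source)

    # run_path executes the file and returns its global namespace as a dict.
    namespace = runpy.run_path(str(script_path), run_name="__main__")

# Unlike %run, the names are not injected into the current session automatically;
# they have to be pulled out of the returned namespace.
square = namespace["square"]
print(square(5))  # 25
```

In day-to-day notebook work `%run` is more convenient, since it makes the script's functions available directly.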

There are several options to fine-tune how your code is run; you can see the documentation in the normal way, by typing **`%run?`** in the IPython interpreter.

#### Timing Code Execution: %timeit
Another example of a useful magic function is `%timeit`, which will automatically determine the execution time of the single-line Python statement that follows it.
For example, we may want to check the performance of a list comprehension:

```ipython
In [8]: %timeit L = [n ** 2 for n in range(1000)]
430 µs ± 3.21 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```

The benefit of `%timeit` is that for short commands it will automatically perform multiple runs in order to attain more robust results.
For multiline statements, adding a second `%` sign will turn this into a cell magic that can handle multiple lines of input.
For example, here's the equivalent construction with a `for` loop:

```ipython
In [9]: %%timeit
   ...: L = []
   ...: for n in range(1000):
   ...:     L.append(n ** 2)
   ...: 
484 µs ± 5.67 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```

We can immediately see that list comprehensions are about 10% faster than the equivalent `for` loop construction in this case.
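`%timeit` is only available inside IPython or Jupyter. In a regular Python script, the standard library's `timeit` module offers similar functionality. The sketch below is one way to reproduce the comparison. The function names `squares_comprehension` and `squares_loop` are made up for this example, and wrapping the code in functions adds a small call overhead, so the numbers will not match the `%timeit` output exactly.

```python
import timeit


def squares_comprehension():
    return [n ** 2 for n in range(1000)]


def squares_loop():
    result = []
    for n in range(1000):
        result.append(n ** 2)
    return result


# repeat=7 and number=1000 mirror the "7 runs, 1000 loops each" reported by %timeit.
for func in (squares_comprehension, squares_loop):
    run_times = timeit.repeat(func, repeat=7, number=1000)
    best_per_loop = min(run_times) / 1000  # seconds for a single call
    print(f"{func.__name__}: {best_per_loop * 1e6:.1f} µs per loop (best of 7 runs)")
```

Timings vary between machines and Python versions, so treat the relative difference as more meaningful than the absolute values.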

## How to write Markdown

If you have ever taken any classes in HTML, you will find Markdown much easier to learn. The cheatsheet below is a handy reference:

https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet

## How to use Anaconda

[Anaconda](https://www.anaconda.com/) is a popular open-source distribution of the Python and R programming languages, along with a collection of libraries and tools commonly used in data science, machine learning, and scientific computing. It simplifies the process of setting up and managing environments for these languages and their associated packages by providing a package management system called conda. Anaconda includes hundreds of pre-installed packages and can be easily expanded with additional packages from the conda repository.

One of the key features of Anaconda is its ability to create isolated environments, which allow users to work on different projects with different dependencies without conflicts. This makes it easier to manage dependencies and ensure reproducibility in data analysis and research projects.

Anaconda is widely used by data scientists, researchers, and developers to streamline their workflow and access a comprehensive set of tools and libraries for data analysis, visualization, machine learning, and more.


I'll cover the specifics during our class session. If you happen to miss it, you can catch up by watching a video tutorial like [this](https://www.youtube.com/watch?v=MUZtVEDKXsk) or finding similar resources online. I'm sure you'll find plenty of helpful tutorials available. Additionally, feel free to reach out to your diligent classmates or me for assistance. However, I do encourage you to conduct your due diligence beforehand.

## How to use Anaconda + IDE (PyCharm/VSCode/Spyder)

I'll demonstrate the specifics during our class session. Again, attendance is important in this class. 

If you miss the class, you can refer to tutorials:

- [VSCode + Conda](https://code.visualstudio.com/docs/python/environments)
- [PyCharm + Conda](https://docs.anaconda.com/free/working-with-conda/ide-tutorials/pycharm/)

Using Spyder is similar to using Jupyter Notebook: you will have access to the packages in the same environment from which you launch Spyder or Jupyter Notebook.
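If you are unsure which environment your IDE or notebook is actually using, a quick check like the following can help. This is a minimal sketch: `numpy` and `pandas` are only example package names, and the `CONDA_DEFAULT_ENV` variable is set by conda on activation, so it may be missing if the tool was started some other way.

```python
import importlib.util
import os
import sys

# Path of the Python interpreter running this code.
print("Interpreter:", sys.executable)

# Root directory of the environment that interpreter belongs to.
print("Environment prefix:", sys.prefix)

# Name of the active conda environment, if conda set it.
print("Conda environment:", os.environ.get("CONDA_DEFAULT_ENV", "(not set)"))

# Check whether example packages are installed here, without importing them.
for package in ("numpy", "pandas"):
    found = importlib.util.find_spec(package) is not None
    print(f"{package}: {'available' if found else 'not installed here'}")
```

If the interpreter path does not point inside the environment you expected, select the correct interpreter in your IDE's settings or relaunch the tool from the activated environment.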

# 2. Python Fundamentals

1. Download/Clone the 3rd edition notebooks from https://github.com/wesm/pydata-book

2. Explore [Ch2](https://wesmckinney.com/book/python-basics) and [Ch3](https://wesmckinney.com/book/python-builtin) while coding along with the relevant Jupyter notebooks. Use these chapters to build a foundational understanding of Python's syntax and data structures.

3. Our plan is to cover the Python fundamentals during Week 1 and Week 2, followed by the introduction of the Python Pandas package in Week 3.
