<!--BOOK_INFORMATION-->
<img align="left" style="padding-right:10px;" src="figures/PDSH-cover-small.png">

*This notebook contains an excerpt from the [Python Data Science Handbook](http://shop.oreilly.com/product/0636920034919.do) by Jake VanderPlas; the content is available [on GitHub](https://github.com/jakevdp/PythonDataScienceHandbook).*

*The text is released under the [CC-BY-NC-ND license](https://creativecommons.org/licenses/by-nc-nd/3.0/us/legalcode), and code is released under the [MIT license](https://opensource.org/licenses/MIT). If you find this content useful, please consider supporting the work by [buying the book](http://shop.oreilly.com/product/0636920034919.do)!*

# Chapter 00.00

# Preface

## What Is Data Science?

This is a book about doing data science with Python, which immediately raises the question: what is *data science*?
It's a surprisingly hard definition to nail down, especially given how ubiquitous the term has become.
Vocal critics have variously dismissed the term as a superfluous label (after all, what science doesn't involve data?) or a simple buzzword that only exists to salt resumes and catch the eye of overzealous tech recruiters.

In my mind, these critiques miss something important.
Data science, despite its hype-laden veneer, is perhaps the best label we have for the cross-disciplinary set of skills that are becoming increasingly important in many applications across industry and academia.
This cross-disciplinary piece is key: in my mind, the best existing definition of data science is illustrated by Drew Conway's Data Science Venn Diagram, first published on his blog in September 2010:

![Data Science Venn Diagram](figures/Data_Science_VD.png)

<small>(Source: [Drew Conway](http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram). Used by permission.)</small>

While some of the intersection labels are a bit tongue-in-cheek, this diagram captures the essence of what I think people mean when they say "data science": it is fundamentally an *interdisciplinary* subject.
Data science comprises three distinct and overlapping areas: the skills of a *statistician* who knows how to model and summarize datasets (which are growing ever larger); the skills of a *computer scientist* who can design and use algorithms to efficiently store, process, and visualize this data; and the *domain expertise*—what we might think of as "classical" training in a subject—necessary both to formulate the right questions and to put their answers in context.

With this in mind, I would encourage you to think of data science not as a new domain of knowledge to learn, but as a new set of skills that you can apply within your current area of expertise.
Whether you are reporting election results, forecasting stock returns, optimizing online ad clicks, identifying microorganisms in microscope photos, seeking new classes of astronomical objects, or working with data in any other field, the goal of this book is to give you the ability to ask and answer new questions about your chosen subject area.

## Who Is This Book For?

In my teaching both at the University of Washington and at various tech-focused conferences and meetups, one of the most common questions I have heard is this: "how should I learn Python?"
The people asking are generally technically minded students, developers, or researchers, often with an already strong background in writing code and using computational and numerical tools.
Most of these folks don't want to learn Python *per se*, but want to learn the language with the aim of using it as a tool for data-intensive and computational science.
While a large patchwork of videos, blog posts, and tutorials for this audience is available online, I've long been frustrated by the lack of a single good answer to this question; that is what inspired this book.

The book is not meant to be an introduction to Python or to programming in general; I assume the reader has familiarity with the Python language, including defining functions, assigning variables, calling methods of objects, controlling the flow of a program, and other basic tasks.
Instead it is meant to help Python users learn to use Python's data science stack–libraries such as IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and related tools–to effectively store, manipulate, and gain insight from data.

## Why Python?

Python has emerged over the last couple of decades as a first-class tool for scientific computing tasks, including the analysis and visualization of large datasets.
This may have come as a surprise to early proponents of the Python language: the language itself was not specifically designed with data analysis or scientific computing in mind.
The usefulness of Python for data science stems primarily from the large and active ecosystem of third-party packages: *NumPy* for manipulation of homogeneous array-based data, *Pandas* for manipulation of heterogeneous and labeled data, *SciPy* for common scientific computing tasks, *Matplotlib* for publication-quality visualizations, *IPython* for interactive execution and sharing of code, *Scikit-Learn* for machine learning, and many more tools that will be mentioned in the following pages.

If you are looking for a guide to the Python language itself, I would suggest the sister project to this book, "[A Whirlwind Tour of the Python Language](https://github.com/jakevdp/WhirlwindTourOfPython)".
This short report provides a tour of the essential features of the Python language, aimed at data scientists who already are familiar with one or more other programming languages.

### Python 2 vs Python 3

This book uses the syntax of Python 3, which contains language enhancements that are not compatible with the 2.x series of Python.
Though Python 3.0 was first released in 2008, adoption has been relatively slow, particularly in the scientific and web development communities.
This is primarily because it took some time for many of the essential third-party packages and toolkits to be made compatible with the new language internals.
Since early 2014, however, stable releases of the most important tools in the data science ecosystem have been fully compatible with both Python 2 and 3, and so this book will use the newer Python 3 syntax.
However, the vast majority of code snippets in this book will also work without modification in Python 2: in cases where a Py2-incompatible syntax is used, I will make every effort to note it explicitly.
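
As a rough illustration of the kind of incompatibility meant here (these particular examples are my own choice for this sketch, not a list of what appears in later chapters), two of the most visible differences are that ``/`` always performs true division in Python 3 and that ``print`` is a function:

```python
# Python 3 semantics. Under Python 2 the same lines behave differently
# unless `from __future__ import print_function, division` is used.

# `/` is always true division in Python 3 (Python 2 gives 3 for two ints).
true_quotient = 7 / 2

# `//` is floor division in both Python 2 and Python 3.
floor_quotient = 7 // 2

# `print` is a function, so keyword arguments such as `sep` are available.
print(true_quotient, floor_quotient, sep=" | ")
```

Run under Python 3, this prints ``3.5 | 3``.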

## Outline of the Book

Each chapter of this book focuses on a particular package or tool that contributes a fundamental piece of the Python Data Science story.

1. IPython and Jupyter: these packages provide the computational environment in which many Python-using data scientists work.
2. NumPy: this library provides the ``ndarray`` for efficient storage and manipulation of dense data arrays in Python.
3. Pandas: this library provides the ``DataFrame`` for efficient storage and manipulation of labeled/columnar data in Python.
4. Matplotlib: this library provides capabilities for a flexible range of data visualizations in Python.
5. Scikit-Learn: this library provides efficient & clean Python implementations of the most important and established machine learning algorithms.

The PyData world is certainly much larger than these five packages, and is growing every day.
With this in mind, I make every attempt through these pages to provide references to other interesting efforts, projects, and packages that are pushing the boundaries of what can be done in Python.
Nevertheless, these five are currently fundamental to much of the work being done in the Python data science space, and I expect they will remain important even as the ecosystem continues growing around them.

## Using Code Examples

Supplemental material (code examples, figures, etc.) is available for download at http://github.com/jakevdp/PythonDataScienceHandbook/. This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing a CD-ROM of examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.

We appreciate, but do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example:

> *The Python Data Science Handbook* by Jake VanderPlas (O’Reilly). Copyright 2016 Jake VanderPlas, 978-1-491-91205-8.

If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at permissions@oreilly.com.

## Installation Considerations

Installing Python and the suite of libraries that enable scientific computing is straightforward. This section will outline some of the considerations when setting up your computer.

Though there are various ways to install Python, the one I would suggest for use in data science is the Anaconda distribution, which works similarly whether you use Windows, Linux, or Mac OS X.
The Anaconda distribution comes in two flavors:

- [Miniconda](http://conda.pydata.org/miniconda.html) gives you the Python interpreter itself, along with a command-line tool called ``conda`` which operates as a cross-platform package manager geared toward Python packages, similar in spirit to the apt or yum tools that Linux users might be familiar with.

- [Anaconda](https://www.continuum.io/downloads) includes both Python and conda, and additionally bundles a suite of other pre-installed packages geared toward scientific computing. Because of the size of this bundle, expect the installation to consume several gigabytes of disk space.

Any of the packages included with Anaconda can also be installed manually on top of Miniconda; for this reason I suggest starting with Miniconda.

To get started, download and install the Miniconda package–make sure to choose a version with Python 3–and then install the core packages used in this book:

```
[~]$ conda install numpy pandas scikit-learn matplotlib seaborn jupyter
```

Throughout the text, we will also make use of other more specialized tools in Python's scientific ecosystem; installation is usually as easy as typing **``conda install packagename``**.
For more information on conda, including information about creating and using conda environments (which I would *highly* recommend), refer to [conda's online documentation](http://conda.pydata.org/docs/).
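
Once the install finishes, one quick way to confirm that the packages are visible to your Python session is to ask the import system whether it can find each one. The following is a minimal sketch using only the standard library; the helper name ``find_missing`` and the package list are my own choices for illustration:

```python
import importlib.util

# Import names of the core packages installed above; note that
# scikit-learn is imported under the name `sklearn`.
CORE_PACKAGES = ["numpy", "pandas", "sklearn", "matplotlib", "seaborn", "jupyter"]


def find_missing(packages):
    """Return the names in `packages` that cannot be found for import."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


missing = find_missing(CORE_PACKAGES)
print("Missing packages:", missing or "none")
```

An empty result suggests the environment is ready; any names listed can usually be added with ``conda install``.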

# Chapter 01.00

# IPython: Beyond Normal Python

There are many options for development environments for Python, and I'm often asked which one I use in my own work.
My answer sometimes surprises people: my preferred environment is [IPython](http://ipython.org/) plus a text editor (in my case, Emacs or Atom depending on my mood).
IPython (short for *Interactive Python*) was started in 2001 by Fernando Perez as an enhanced Python interpreter, and has since grown into a project aiming to provide, in Perez's words, "Tools for the entire life cycle of research computing."
If Python is the engine of our data science task, you might think of IPython as the interactive control panel.

As well as being a useful interactive interface to Python, IPython also provides a number of useful syntactic additions to the language; we'll cover the most useful of these additions here.
In addition, IPython is closely tied with the [Jupyter project](http://jupyter.org), which provides a browser-based notebook that is useful for development, collaboration, sharing, and even publication of data science results.
The IPython notebook is actually a special case of the broader Jupyter notebook structure, which encompasses notebooks for Julia, R, and other programming languages.
As an example of the usefulness of the notebook format, look no further than the page you are reading: the entire manuscript for this book was composed as a set of IPython notebooks.

IPython is about using Python effectively for interactive scientific and data-intensive computing.
This chapter will start by stepping through some of the IPython features that are useful to the practice of data science, focusing especially on the syntax it offers beyond the standard features of Python.
Next, we will go into a bit more depth on some of the more useful "magic commands" that can speed up common tasks in creating and using data science code.
Finally, we will touch on some of the features of the notebook that make it useful in understanding data and sharing results.

## Shell or Notebook?

There are two primary means of using IPython that we'll discuss in this chapter: the IPython shell and the IPython notebook.
The bulk of the material in this chapter is relevant to both, and the examples will switch between them depending on what is most convenient.
In the few sections that are relevant to just one or the other, we will explicitly state that fact.
Before we start, some words on how to launch the IPython shell and IPython notebook.

### Launching the IPython Shell

This chapter, like most of this book, is not designed to be absorbed passively.
I recommend that as you read through it, you follow along and experiment with the tools and syntax we cover: the muscle-memory you build through doing this will be far more useful than the simple act of reading about it.
Start by launching the IPython interpreter by typing **``ipython``** on the command-line; alternatively, if you've installed a distribution like Anaconda or EPD, there may be a launcher specific to your system (we'll discuss this more fully in [Help and Documentation in IPython](01.01-Help-And-Documentation.ipynb)).

Once you do this, you should see a prompt like the following:
```
IPython 4.0.1 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.
In [1]:
```
With that, you're ready to follow along.

### Launching the Jupyter Notebook

The Jupyter notebook is a browser-based graphical interface to the IPython shell, and builds on it a rich set of dynamic display capabilities.
As well as executing Python/IPython statements, the notebook allows the user to include formatted text, static and dynamic visualizations, mathematical equations, JavaScript widgets, and much more.
Furthermore, these documents can be saved in a way that lets other people open them and execute the code on their own systems.

Though the IPython notebook is viewed and edited through your web browser window, it must connect to a running Python process in order to execute code.
This process (known as a "kernel") can be started by running the following command in your system shell:

```
$ jupyter notebook
```

This command will launch a local web server that will be visible to your browser.
It immediately spits out a log showing what it is doing; that log will look something like this:

```
$ jupyter notebook
[NotebookApp] Serving notebooks from local directory: /Users/jakevdp/PythonDataScienceHandbook
[NotebookApp] 0 active kernels 
[NotebookApp] The IPython Notebook is running at: http://localhost:8888/
[NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
```

Upon issuing the command, your default browser should automatically open and navigate to the listed local URL;
the exact address will depend on your system.
If the browser does not open automatically, you can open a window and manually open this address (*http://localhost:8888/* in this example).

# Chapter 01.01

# Help and Documentation in IPython

If you read no other section in this chapter, read this one: I find the tools discussed here to be the most transformative contributions of IPython to my daily workflow.

When a technologically minded person is asked to help a friend, family member, or colleague with a computer problem, most of the time it's less a matter of knowing the answer than of knowing how to quickly find an unknown answer.
In data science it's the same: searchable web resources such as online documentation, mailing-list threads, and StackOverflow answers contain a wealth of information, even (especially?) if it is a topic you've found yourself searching before.
Being an effective practitioner of data science is less about memorizing the tool or command you should use for every possible situation, and more about learning to effectively find the information you don't know, whether through a web search engine or another means.

One of the most useful functions of IPython/Jupyter is to shorten the gap between the user and the type of documentation and search that will help them do their work effectively.
While web searches still play a role in answering complicated questions, an amazing amount of information can be found through IPython alone.
Some examples of the questions IPython can help answer in a few keystrokes:

- How do I call this function? What arguments and options does it have?
- What does the source code of this Python object look like?
- What is in this package I imported? What attributes or methods does this object have?

Here we'll discuss IPython's tools to quickly access this information, namely the ``?`` character to explore documentation, the ``??`` characters to explore source code, and the Tab key for auto-completion.

## Accessing Documentation with ``?``

The Python language and its data science ecosystem are built with the user in mind, and one big part of that is access to documentation.
Every Python object contains a reference to a string, known as a *docstring*, which in most cases will contain a concise summary of the object and how to use it.
Python has a built-in ``help()`` function that accesses this information and prints the results.
For example, to see the documentation of the built-in ``len`` function, you can do the following:

```ipython
In [1]: help(len)
Help on built-in function len in module builtins:

len(...)
    len(object) -> integer
    
    Return the number of items of a sequence or mapping.
```

Depending on your interpreter, this information may be displayed as inline text, or in some separate pop-up window.

Because finding help on an object is so common and useful, IPython introduces the ``?`` character as a shorthand for accessing this documentation and other relevant information:

```ipython
In [2]: len?
Type:        builtin_function_or_method
String form: <built-in function len>
Namespace:   Python builtin
Docstring:
len(object) -> integer

Return the number of items of a sequence or mapping.
```

This notation works for just about anything, including object methods:

```ipython
In [3]: L = [1, 2, 3]
In [4]: L.insert?
Type:        builtin_function_or_method
String form: <built-in method insert of list object at 0x1024b8ea8>
Docstring:   L.insert(index, object) -- insert object before index
```

or even objects themselves, with the documentation from their type:

```ipython
In [5]: L?
Type:        list
String form: [1, 2, 3]
Length:      3
Docstring:
list() -> new empty list
list(iterable) -> new list initialized from iterable's items
```

Importantly, this will even work for functions or other objects you create yourself!
Here we'll define a small function with a docstring:

```ipython
In [6]: def square(a):
  ....:     """Return the square of a."""
  ....:     return a ** 2
  ....:
```

Note that to create a docstring for our function, we simply placed a string literal as the first line of the function body.
Because docstrings usually span multiple lines, by convention we used Python's triple-quote notation for multi-line strings.

Now we'll use the ``?`` mark to find this docstring:

```ipython
In [7]: square?
Type:        function
String form: <function square at 0x103713cb0>
Definition:  square(a)
Docstring:   Return the square of a.
```

This quick access to documentation via docstrings is one reason you should get in the habit of always adding such inline documentation to the code you write!
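
The ``?`` shorthand is specific to IPython, but the docstring it displays is an ordinary attribute of the object. As a small sketch in plain Python (reusing the ``square`` function and the list ``L`` from the examples above), the same text can be reached through ``__doc__`` or the standard library's ``inspect.getdoc``:

```python
import inspect


def square(a):
    """Return the square of a."""
    return a ** 2


# The raw docstring is stored on the object's __doc__ attribute.
print(square.__doc__)

# inspect.getdoc() returns the same text with indentation cleaned up,
# which matters mostly for multi-line docstrings.
print(inspect.getdoc(square))

# For an instance, the documentation comes from its type.
L = [1, 2, 3]
print(L.__doc__ == type(L).__doc__)
```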

## Accessing Source Code with ``??``
Because the Python language is so easily readable, another level of insight can usually be gained by reading the source code of the object you're curious about.
IPython provides a shortcut to the source code with the double question mark (``??``):

```ipython
In [8]: square??
Type:        function
String form: <function square at 0x103713cb0>
Definition:  square(a)
Source:
def square(a):
    """Return the square of a."""
    return a ** 2
```

For simple functions like this, the double question-mark can give quick insight into the under-the-hood details.

If you play with this much, you'll notice that sometimes the ``??`` suffix doesn't display any source code: this is generally because the object in question is not implemented in Python, but in C or some other compiled extension language.
If this is the case, the ``??`` suffix gives the same output as the ``?`` suffix.
You'll find this particularly with many of Python's built-in objects and types, for example ``len`` from above:

```ipython
In [9]: len??
Type:        builtin_function_or_method
String form: <built-in function len>
Namespace:   Python builtin
Docstring:
len(object) -> integer

Return the number of items of a sequence or mapping.
```

Using ``?`` and/or ``??`` gives a powerful and quick interface for finding information about what any Python function or module does.
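
Outside of IPython, a comparable lookup is available through the standard library's ``inspect.getsource``. The sketch below wraps it in a hypothetical helper, ``source_or_none``, that returns ``None`` when no Python source is available, which is the situation described above for objects implemented in C:

```python
import inspect
import textwrap


def source_or_none(obj):
    """Return the source of `obj` as a string, or None if unavailable."""
    try:
        return inspect.getsource(obj)
    except (OSError, TypeError):
        # TypeError: built-in objects implemented in C, such as len.
        # OSError: the source file could not be located.
        return None


# textwrap.dedent is written in Python, so its source can usually be read.
dedent_source = source_or_none(textwrap.dedent)
print(dedent_source is not None)

# len is implemented in C, so there is no Python source to show.
print(source_or_none(len))
```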

## Exploring Modules with Tab-Completion

IPython's other useful interface is the use of the Tab key for auto-completion and exploration of the contents of objects, modules, and namespaces.
In the examples that follow, we'll use ``<TAB>`` to indicate when the Tab key should be pressed.

### Tab-completion of object contents

Every Python object has various attributes and methods associated with it.
As with the ``help`` function discussed before, Python has a built-in ``dir`` function that returns a list of these, but the tab-completion interface is much easier to use in practice.
To see a list of all available attributes of an object, you can type the name of the object followed by a period ("``.``") character and the Tab key:

```ipython
In [10]: L.<TAB>
L.append   L.copy     L.extend   L.insert   L.remove   L.sort     
L.clear    L.count    L.index    L.pop      L.reverse  
```

To narrow down the list, you can type the first character or several characters of the name, and the Tab key will find the matching attributes and methods:

```ipython
In [10]: L.c<TAB>
L.clear  L.copy   L.count  

In [10]: L.co<TAB>
L.copy   L.count 
```

If there is only a single option, pressing the Tab key will complete the line for you.
For example, the following will instantly be replaced with ``L.count``:

```ipython
In [10]: L.cou<TAB>

```

Though Python has no strictly enforced distinction between public/external attributes and private/internal attributes, by convention a leading underscore is used to denote private attributes and methods.
For clarity, these private methods and special methods are omitted from the list by default, but it's possible to list them by explicitly typing the underscore:

```ipython
In [10]: L._<TAB>
L.__add__           L.__gt__            L.__reduce__
L.__class__         L.__hash__          L.__reduce_ex__
```

For brevity, we've only shown the first couple of lines of the output.
Most of these are Python's special double-underscore methods (often nicknamed "dunder" methods).
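
For comparison, here is a rough approximation of the same filtering done by hand with the built-in ``dir`` function. The helper ``matching_names`` is hypothetical and only mimics the behavior described above; it is not how IPython implements completion:

```python
L = [1, 2, 3]


def matching_names(obj, prefix=""):
    """List attribute names of `obj` that start with `prefix`.

    Names with a leading underscore are hidden unless the prefix
    itself starts with an underscore.
    """
    names = [name for name in dir(obj) if name.startswith(prefix)]
    if not prefix.startswith("_"):
        names = [name for name in names if not name.startswith("_")]
    return names


print(matching_names(L))           # comparable to L.<TAB>
print(matching_names(L, "co"))     # comparable to L.co<TAB>
print(matching_names(L, "_")[:4])  # comparable to the start of L._<TAB>
```

The second call returns ``['copy', 'count']``, matching the completion shown earlier.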

### Tab completion when importing

Tab completion is also useful when importing objects from packages.
Here we'll use it to find all possible imports in the ``itertools`` package that start with ``co``:
```ipython
In [10]: from itertools import co<TAB>
combinations                   compress
combinations_with_replacement  count
```
Similarly, you can use tab-completion to see which imports are available on your system (this will change depending on which third-party scripts and modules are visible to your Python session):
```ipython
In [10]: import <TAB>
Display all 399 possibilities? (y or n)
Crypto              dis                 py_compile
Cython              distutils           pyclbr
...                 ...                 ...
difflib             pwd                 zmq

In [10]: import h<TAB>
hashlib             hmac                http         
heapq               html                husl         
```
(Note that for brevity, I did not print here all 399 importable packages and modules on my system.)

### Beyond tab completion: wildcard matching

Tab completion is useful if you know the first few characters of the object or attribute you're looking for, but is of little help if you'd like to match characters in the middle or at the end of the word.
For this use-case, IPython provides a means of wildcard matching for names using the ``*`` character.

For example, we can use this to list every object in the namespace that ends with ``Warning``:

```ipython
In [10]: *Warning?
BytesWarning                  RuntimeWarning
DeprecationWarning            SyntaxWarning
FutureWarning                 UnicodeWarning
ImportWarning                 UserWarning
PendingDeprecationWarning     Warning
ResourceWarning
```

Notice that the ``*`` character matches any string, including the empty string.

Similarly, suppose we are looking for a string method that contains the word ``find`` somewhere in its name.
We can search for it this way:

```ipython
In [10]: str.*find*?
str.find
str.rfind
```

I find this type of flexible wildcard search can be very useful for finding a particular command when getting to know a new package or reacquainting myself with a familiar one.
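
A similar search can be sketched in plain Python by combining ``dir`` with shell-style pattern matching from the standard library's ``fnmatch`` module. The helper ``wildcard_names`` is my own illustration; unlike IPython's wildcard search, it only looks at the attributes of the single object you pass in:

```python
import builtins
from fnmatch import fnmatchcase


def wildcard_names(obj, pattern):
    """Return attribute names of `obj` matching a shell-style pattern."""
    return [name for name in dir(obj) if fnmatchcase(name, pattern)]


# Comparable to `*Warning?`, restricted to Python's built-in names.
print(wildcard_names(builtins, "*Warning"))

# Comparable to `str.*find*?`.
print(wildcard_names(str, "*find*"))
```

The exact list of warnings depends on your Python version; the second call returns ``['find', 'rfind']``.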

# Chapter 01.02

# Keyboard Shortcuts in the IPython Shell

If you spend any amount of time on the computer, you've probably found a use for keyboard shortcuts in your workflow.
Most familiar perhaps are the Cmd-C and Cmd-V (or Ctrl-C and Ctrl-V) for copying and pasting in a wide variety of programs and systems.
Power-users tend to go even further: popular text editors like Emacs, Vim, and others provide users an incredible range of operations through intricate combinations of keystrokes.

The IPython shell doesn't go this far, but does provide a number of keyboard shortcuts for fast navigation while typing commands.
These shortcuts are not in fact provided by IPython itself, but through its dependency on the GNU Readline library: as such, some of the following shortcuts may differ depending on your system configuration.
Also, while some of these shortcuts do work in the browser-based notebook, this section is primarily about shortcuts in the IPython shell.

Once you get accustomed to these, they can be very useful for quickly performing certain commands without moving your hands from the "home" keyboard position.
If you're an Emacs user or if you have experience with Linux-style shells, the following will be very familiar.
We'll group these shortcuts into a few categories: *navigation shortcuts*, *text entry shortcuts*, *command history shortcuts*, and *miscellaneous shortcuts*.

## Navigation shortcuts

While the use of the left and right arrow keys to move backward and forward in the line is quite obvious, there are other options that don't require moving your hands from the "home" keyboard position:

| Keystroke                         | Action                                     |
|-----------------------------------|--------------------------------------------|
| ``Ctrl-a``                        | Move cursor to the beginning of the line   |
| ``Ctrl-e``                        | Move cursor to the end of the line         |
| ``Ctrl-b`` or the left arrow key  | Move cursor back one character             |
| ``Ctrl-f`` or the right arrow key | Move cursor forward one character          |

## Text Entry Shortcuts

While everyone is familiar with using the Backspace key to delete the previous character, reaching for the key often requires some minor finger gymnastics, and it only deletes a single character at a time.
In IPython there are several shortcuts for removing some portion of the text you're typing.
The most immediately useful of these are the commands to delete entire lines of text.
You'll know these have become second-nature if you find yourself using a combination of Ctrl-b and Ctrl-d instead of reaching for Backspace to delete the previous character!

| Keystroke                     | Action                                           |
|-------------------------------|--------------------------------------------------|
| Backspace key                 | Delete previous character in line                |
| ``Ctrl-d``                    | Delete next character in line                    |
| ``Ctrl-k``                    | Cut text from cursor to end of line              |
| ``Ctrl-u``                    | Cut text from beginning of line to cursor        |
| ``Ctrl-y``                    | Yank (i.e. paste) text that was previously cut   |
| ``Ctrl-t``                    | Transpose (i.e., switch) previous two characters |

## Command History Shortcuts

Perhaps the most impactful shortcuts discussed here are the ones IPython provides for navigating the command history.
This command history goes beyond your current IPython session: your entire command history is stored in a SQLite database in your IPython profile directory.
The most straightforward way to access these is with the up and down arrow keys to step through the history, but other options exist as well:

| Keystroke                           | Action                                     |
|-------------------------------------|--------------------------------------------|
| ``Ctrl-p`` (or the up arrow key)    | Access previous command in history         |
| ``Ctrl-n`` (or the down arrow key)  | Access next command in history             |
| ``Ctrl-r``                          | Reverse-search through command history     |

The reverse-search can be particularly useful.
Recall that in the previous section we defined a function called ``square``.
Let's reverse-search our Python history from a new IPython shell and find this definition again.
When you press Ctrl-r in the IPython terminal, you'll see the following prompt:

```ipython
In [1]:
(reverse-i-search)`': 
```

If you start typing characters at this prompt, IPython will auto-fill the most recent command, if any, that matches those characters:

```ipython
In [1]: 
(reverse-i-search)`squ': square??
```

At any point, you can add more characters to refine the search, or press Ctrl-r again to search further for another command that matches the query. If you followed along in the previous section, pressing Ctrl-r twice more gives:

```ipython
In [1]: 
(reverse-i-search)`squ': def square(a):
    """Return the square of a"""
    return a ** 2
```

Once you have found the command you're looking for, press Return and the search will end.
We can then use the retrieved command and carry on with our session:

```ipython
In [1]: def square(a):
    """Return the square of a"""
    return a ** 2

In [2]: square(2)
Out[2]: 4
```

Note that Ctrl-p/Ctrl-n or the up/down arrow keys can also be used to search through history, but only by matching characters at the beginning of the line.
That is, if you type **``def``** and then press Ctrl-p, it would find the most recent command (if any) in your history that begins with the characters ``def``.
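The difference between the two search styles can be sketched in plain Python. This is only an illustration of the matching rules described above, not IPython's actual implementation: the ``history`` list and both helper functions are hypothetical names introduced for the example.

```python
# Hypothetical stand-in for a command history, oldest entry first.
history = [
    "import math",
    "def square(a):\n    return a ** 2",
    "square(2)",
    "square??",
]

def reverse_search(history, query):
    """Return the most recent command containing `query` anywhere (Ctrl-r style)."""
    for command in reversed(history):
        if query in command:
            return command
    return None

def prefix_search(history, prefix):
    """Return the most recent command that starts with `prefix` (Ctrl-p style)."""
    for command in reversed(history):
        if command.startswith(prefix):
            return command
    return None

print(reverse_search(history, "squ"))   # most recent match anywhere in the line
print(prefix_search(history, "def"))    # most recent line beginning with "def"
```

Pressing Ctrl-r repeatedly corresponds to continuing the loop past the first match rather than stopping there.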

## Miscellaneous Shortcuts

Finally, there are a few miscellaneous shortcuts that don't fit into any of the preceding categories, but are nevertheless useful to know:

| Keystroke                     | Action                                     |
|-------------------------------|--------------------------------------------|
| ``Ctrl-l``                    | Clear terminal screen                      |
| ``Ctrl-c``                    | Interrupt current Python command           |
| ``Ctrl-d``                    | Exit IPython session                       |

The Ctrl-c in particular can be useful when you inadvertently start a very long-running job.

While some of the shortcuts discussed here may seem a bit tedious at first, they quickly become automatic with practice.
Once you develop that muscle memory, I suspect you will even find yourself wishing they were available in other contexts.

In [11]:
square(2)

4

# Chapter 01.03

# IPython Magic Commands

The previous two sections showed how IPython lets you use and explore Python efficiently and interactively.
Here we'll begin discussing some of the enhancements that IPython adds on top of the normal Python syntax.
These are known in IPython as *magic commands*, and are prefixed by the ``%`` character.
These magic commands are designed to succinctly solve various common problems in standard data analysis.
Magic commands come in two flavors: *line magics*, which are denoted by a single ``%`` prefix and operate on a single line of input, and *cell magics*, which are denoted by a double ``%%`` prefix and operate on multiple lines of input.
We'll demonstrate and discuss a few brief examples here, and come back to more focused discussion of several useful magic commands later in the chapter.

## Pasting Code Blocks: ``%paste`` and ``%cpaste``

When working in the IPython interpreter, one common gotcha is that pasting multi-line code blocks can lead to unexpected errors, especially when indentation and interpreter markers are involved.
A common case is that you find some example code on a website and want to paste it into your interpreter.
Consider the following simple function:

```python
>>> def donothing(x):
...     return x
```

The code is formatted as it would appear in the Python interpreter, and if you copy and paste this directly into IPython you get an error:

```ipython
In [2]: >>> def donothing(x):
   ...:     ...     return x
   ...:     
  File "<ipython-input-20-5a66c8964687>", line 2
    ...     return x
                 ^
SyntaxError: invalid syntax
```

In the direct paste, the interpreter is confused by the additional prompt characters.
But never fear: IPython's ``%paste`` magic function is designed to handle exactly this type of multi-line, marked-up input:

```ipython
In [3]: %paste
>>> def donothing(x):
...     return x

## -- End pasted text --
```

The ``%paste`` command both enters and executes the code, so now the function is ready to be used:

```ipython
In [4]: donothing(10)
Out[4]: 10
```

A command with a similar intent is ``%cpaste``, which opens up an interactive multiline prompt in which you can paste one or more chunks of code to be executed in a batch:

```ipython
In [5]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:>>> def donothing(x):
:...     return x
:--
```

These magic commands, like others we'll see, make available functionality that would be difficult or impossible in a standard Python interpreter.
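To make the idea concrete, here is a minimal sketch of the kind of cleanup that has to happen before prompt-laden text can be executed. It is a naive illustration rather than what ``%paste`` does internally: the ``strip_prompts`` helper and the ``PROMPT`` pattern are hypothetical, and the sketch assumes every ``>>>`` or ``...`` at the start of a line is a prompt marker.

```python
import re

# Text as it might be copied from a page showing a standard Python session.
pasted = """>>> def donothing(x):
...     return x
"""

# Matches a leading ">>>" or "..." prompt plus one optional following space.
PROMPT = re.compile(r"^(>>>|\.\.\.) ?")

def strip_prompts(text):
    """Remove interpreter prompt markers from the start of each line."""
    return "\n".join(PROMPT.sub("", line) for line in text.splitlines())

source = strip_prompts(pasted)
namespace = {}
exec(source, namespace)              # define the function from the cleaned source
print(namespace["donothing"](10))
```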

## Running External Code: ``%run``

As you begin developing more extensive code, you will likely find yourself working both in IPython for interactive exploration and in a text editor to store code that you want to reuse.
Rather than running this code in a new window, it can be convenient to run it within your IPython session.
This can be done with the ``%run`` magic.

For example, imagine you've created a ``myscript.py`` file with the following contents:

```python
#-------------------------------------
# file: myscript.py

def square(x):
    """square a number"""
    return x ** 2

for N in range(1, 4):
    print(N, "squared is", square(N))
```

You can execute this from your IPython session as follows:

```ipython
In [6]: %run myscript.py
1 squared is 1
2 squared is 4
3 squared is 9
```

Note also that after you've run this script, any functions defined within it are available for use in your IPython session:

```ipython
In [7]: square(5)
Out[7]: 25
```

There are several options to fine-tune how your code is run; you can see the documentation in the normal way, by typing **``%run?``** in the IPython interpreter.
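Outside IPython, a rough analogue of this behavior is available in the standard library's ``runpy`` module. The sketch below is an assumption-laden illustration, not a description of how ``%run`` is implemented: it writes the example script to a temporary directory, executes it, and then pulls ``square`` out of the returned namespace.

```python
import os
import runpy
import tempfile

script = '''
def square(x):
    """square a number"""
    return x ** 2

for N in range(1, 4):
    print(N, "squared is", square(N))
'''

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "myscript.py")
    with open(path, "w") as f:
        f.write(script)
    # run_path executes the file and returns the resulting global namespace
    script_globals = runpy.run_path(path)

# Names defined by the script are available in the returned dictionary.
square = script_globals["square"]
print(square(5))
```

Unlike ``%run``, this does not place the script's names into your interactive namespace automatically; you have to copy them out of the dictionary yourself.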

## Timing Code Execution: ``%timeit``
Another example of a useful magic function is ``%timeit``, which will automatically determine the execution time of the single-line Python statement that follows it.
For example, we may want to check the performance of a list comprehension:

```ipython
In [8]: %timeit L = [n ** 2 for n in range(1000)]
1000 loops, best of 3: 325 µs per loop
```

The benefit of ``%timeit`` is that for short commands it will automatically perform multiple runs in order to attain more robust results.
For multi-line statements, adding a second ``%`` sign will turn this into a cell magic that can handle multiple lines of input.
For example, here's the equivalent construction with a ``for``-loop:

```ipython
In [9]: %%timeit
   ...: L = []
   ...: for n in range(1000):
   ...:     L.append(n ** 2)
   ...: 
1000 loops, best of 3: 373 µs per loop
```

We can immediately see that list comprehensions are about 10% faster than the equivalent ``for``-loop construction in this case.
We'll explore ``%timeit`` and other approaches to timing and profiling code in [Profiling and Timing Code](01.07-Timing-and-Profiling.ipynb).
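For comparison, a similar measurement can be sketched with the standard library's ``timeit`` module, which the ``%timeit`` docstring names as its underlying tool. The function names and the choice of ``repeat=3`` and ``number=1000`` are assumptions made for this example; ``%timeit`` picks its loop count automatically, and the timings you see will depend on your machine.

```python
import timeit

def with_comprehension():
    return [n ** 2 for n in range(1000)]

def with_loop():
    L = []
    for n in range(1000):
        L.append(n ** 2)
    return L

# Three independent measurements, each calling the function 1000 times.
for func in (with_comprehension, with_loop):
    times = timeit.repeat(func, repeat=3, number=1000)
    best = min(times) / 1000          # seconds per call in the fastest repeat
    print(f"{func.__name__}: {best * 1e6:.1f} µs per loop")
```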

## Help on Magic Functions: ``?``, ``%magic``, and ``%lsmagic``

Like normal Python functions, IPython magic functions have docstrings, and this useful
documentation can be accessed in the standard manner.
So, for example, to read the documentation of the ``%timeit`` magic simply type this:

```ipython
In [10]: %timeit?
```

Documentation for other functions can be accessed similarly.
To access a general description of available magic functions, including some examples, you can type this:

```ipython
In [11]: %magic
```

For a quick and simple list of all available magic functions, type this:

```ipython
In [12]: %lsmagic
```

Finally, I'll mention that it is quite straightforward to define your own magic functions if you wish.
We won't discuss it here, but if you are interested, see the references listed in [More IPython Resources](01.08-More-IPython-Resources.ipynb).

In [12]:
%run myscript.py

1 squared is 1
2 squared is 4
3 squared is 9


In [13]:
square(5)

25

In [14]:
%run?

Docstring:
Run the named file inside IPython as a program.

Usage::

  %run [-n -i -e -G]
       [( -t [-N<N>] | -d [-b<N>] | -p [profile options] )]
       ( -m mod | filename ) [args]

The filename argument should be either a pure Python script (with
extension ``.py``), or a file with custom IPython syntax (such as
magics). If the latter, the file can be either a script with ``.ipy``
extension, or a Jupyter notebook with ``.ipynb`` extension. When running
a Jupyter notebook, the output from print statements and other
displayed objects will appear in the terminal (even matplotlib figures
will open, if a terminal-compliant backend is being used). Note that,
at the system command line, the ``jupyter run`` command offers similar
functionality for executing notebooks (albeit currently with some
differences in supported options).

Parameters after the filename are passed as command-line arguments to
the program (put in sys.argv). Then, control returns to IPython's
prompt.

This 

In [15]:
%timeit L= [n ** 2 for n in range(1000)]

82.1 μs ± 6.26 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)


In [16]:
%timeit?

Docstring:
Time execution of a Python statement or expression

Usage, in line mode:
  %timeit [-n<N> -r<R> [-t|-c] -q -p<P> -o] statement
or in cell mode:
  %%timeit [-n<N> -r<R> [-t|-c] -q -p<P> -o] setup_code
  code
  code...

Time execution of a Python statement or expression using the timeit
module.  This function can be used both as a line and cell magic:

- In line mode you can time a single-line statement (though multiple
  ones can be chained with using semicolons).

- In cell mode, the statement in the first line is used as setup code
  (executed but not timed) and the body of the cell is timed.  The cell
  body has access to any variables created in the setup code.

Options:
-n<N>: execute the given statement <N> times in a loop. If <N> is not
provided, <N> is determined so as to get sufficient accuracy.

-r<R>: number of repeats <R>, each consisting of <N> loops, and take the
average result.
Default: 7

-t: use time.time to measure the time, which is the default o

In [17]:
%magic


IPython's 'magic' functions

The magic function system provides a series of functions which allow you to
control the behavior of IPython itself, plus a lot of system-type
features. There are two kinds of magics, line-oriented and cell-oriented.

Line magics are prefixed with the % character and work much like OS
command-line calls: they get as an argument the rest of the line, where
arguments are passed without parentheses or quotes.  For example, this will
time the given statement::

        %timeit range(1000)

Cell magics are prefixed with a double %%, and they are functions that get as
an argument not only the rest of the line, but also the lines below it in a
separate argument.  These magics are called with two arguments: the rest of the
call line and the body of the cell, consisting of the lines below the first.
For example::

        %%timeit x = numpy.random.randn((100, 100))
        numpy.linalg.svd(x)

will time the execution of the numpy svd routine, running the assignment 

In [18]:
%lsmagic

Available line magics:
%alias  %alias_magic  %autoawait  %autocall  %automagic  %autosave  %bookmark  %cd  %clear  %cls  %code_wrap  %colors  %conda  %config  %connect_info  %copy  %ddir  %debug  %dhist  %dirs  %doctest_mode  %echo  %ed  %edit  %env  %gui  %hist  %history  %killbgscripts  %ldir  %less  %load  %load_ext  %loadpy  %logoff  %logon  %logstart  %logstate  %logstop  %ls  %lsmagic  %macro  %magic  %mamba  %matplotlib  %micromamba  %mkdir  %more  %notebook  %page  %pastebin  %pdb  %pdef  %pdoc  %pfile  %pinfo  %pinfo2  %pip  %popd  %pprint  %precision  %prun  %psearch  %psource  %pushd  %pwd  %pycat  %pylab  %qtconsole  %quickref  %recall  %rehashx  %reload_ext  %ren  %rep  %rerun  %reset  %reset_selective  %rmdir  %run  %save  %sc  %set_env  %store  %sx  %system  %tb  %time  %timeit  %unalias  %unload_ext  %who  %who_ls  %whos  %xdel  %xmode

Available cell magics:
%%!  %%HTML  %%SVG  %%bash  %%capture  %%cmd  %%code_wrap  %%debug  %%file  %%html  %%javascript  %%js  %%latex 

# Chapter 01.04

# Input and Output History

Previously we saw that the IPython shell allows you to access previous commands with the up and down arrow keys, or equivalently the Ctrl-p/Ctrl-n shortcuts.
Additionally, in both the shell and the notebook, IPython exposes several ways to obtain the output of previous commands, as well as string versions of the commands themselves.
We'll explore those here.

## IPython's ``In`` and ``Out`` Objects

By now I imagine you're quite familiar with the ``In [1]:``/``Out[1]:`` style prompts used by IPython.
But it turns out that these are not just pretty decoration: they give a clue as to how you can access previous inputs and outputs in your current session.
Imagine you start a session that looks like this:

```ipython
In [1]: import math

In [2]: math.sin(2)
Out[2]: 0.9092974268256817

In [3]: math.cos(2)
Out[3]: -0.4161468365471424
```

We've imported the built-in ``math`` package, then computed the sine and the cosine of the number 2.
These inputs and outputs are displayed in the shell with ``In``/``Out`` labels, but there's more: IPython actually creates some Python variables called ``In`` and ``Out`` that are automatically updated to reflect this history:

```ipython
In [4]: print(In)
['', 'import math', 'math.sin(2)', 'math.cos(2)', 'print(In)']

In [5]: Out
Out[5]: {2: 0.9092974268256817, 3: -0.4161468365471424}
```

The ``In`` object is a list, which keeps track of the commands in order (the first item in the list is a place-holder so that ``In[1]`` can refer to the first command):

```ipython
In [6]: print(In[1])
import math
```

The ``Out`` object is not a list but a dictionary mapping input numbers to their outputs (if any):

```ipython
In [7]: print(Out[2])
0.9092974268256817
```

Note that not all operations have outputs: for example, ``import`` statements and calls to ``print`` don't add anything to ``Out``.
The latter may be surprising, but makes sense if you consider that ``print`` is a function that returns ``None``; for brevity, any command that returns ``None`` is not added to ``Out``.

Where this can be useful is if you want to interact with past results.
For example, let's check the sum of ``sin(2) ** 2`` and ``cos(2) ** 2`` using the previously-computed results:

```ipython
In [8]: Out[2] ** 2 + Out[3] ** 2
Out[8]: 1.0
```

The result is ``1.0`` as we'd expect from the well-known trigonometric identity.
In this case, using these previous results probably is not necessary, but it can become very handy if you execute a very expensive computation and want to reuse the result!
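The bookkeeping described in this section can be imitated in a few lines of ordinary Python. This is a toy model under stated assumptions, not IPython's implementation: the names ``inputs``, ``outputs``, and ``run`` are hypothetical stand-ins for ``In``, ``Out``, and the interactive prompt.

```python
inputs = [""]    # placeholder so that inputs[1] is the first command
outputs = {}     # maps input number -> result, only when the result is not None

def run(source):
    """Record `source`, evaluate it, and store any non-None result."""
    inputs.append(source)
    number = len(inputs) - 1
    try:
        result = eval(source, globals())      # expressions produce a value
    except SyntaxError:
        exec(source, globals())               # statements such as import do not
        result = None
    if result is not None:
        outputs[number] = result
    return result

run("import math")
run("math.sin(2)")
run("math.cos(2)")
run("print(inputs)")      # prints the list; print returns None, so nothing is stored
print(outputs)
```

Because only non-``None`` results are stored, the dictionary ends up with keys 2 and 3 only, mirroring the ``Out`` object shown earlier.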

## Underscore Shortcuts and Previous Outputs

The standard Python shell contains just one simple shortcut for accessing previous output: the variable ``_`` (i.e., a single underscore) is kept updated with the previous output. This works in IPython as well:

```ipython
In [9]: print(_)
1.0
```

But IPython takes this a bit further—you can use a double underscore to access the second-to-last output, and a triple underscore to access the third-to-last output (skipping any commands with no output):

```ipython
In [10]: print(__)
-0.4161468365471424

In [11]: print(___)
0.9092974268256817
```

IPython stops there: more than three underscores starts to get a bit hard to count, and at that point it's easier to refer to the output by line number.

There is one more shortcut we should mention, however: a shorthand for ``Out[X]`` is ``_X`` (i.e., a single underscore followed by the line number):

```ipython
In [12]: Out[2]
Out[12]: 0.9092974268256817

In [13]: _2
Out[13]: 0.9092974268256817
```

## Suppressing Output
Sometimes you might wish to suppress the output of a statement (this is perhaps most common with the plotting commands that we'll explore in [Introduction to Matplotlib](04.00-Introduction-To-Matplotlib.ipynb)).
Or maybe the command you're executing produces a result that you'd prefer not to store in your output history, perhaps so that it can be deallocated when other references are removed.
The easiest way to suppress the output of a command is to add a semicolon to the end of the line:

```ipython
In [14]: math.sin(2) + math.cos(2);
```

Note that the result is computed silently, and the output is neither displayed on the screen nor stored in the ``Out`` dictionary:

```ipython
In [15]: 14 in Out
Out[15]: False
```

## Related Magic Commands
For accessing a batch of previous inputs at once, the ``%history`` magic command is very helpful.
Here is how you can print the first four inputs:

```ipython
In [16]: %history -n 1-4
   1: import math
   2: math.sin(2)
   3: math.cos(2)
   4: print(In)
```

As usual, you can type ``%history?`` for more information and a description of options available.
Other similar magic commands are ``%rerun`` (which will re-execute some portion of the command history) and ``%save`` (which saves some set of the command history to a file).
For more information, I suggest exploring these using the ``?`` help functionality discussed in [Help and Documentation in IPython](01.01-Help-And-Documentation.ipynb).

In [19]:
import math

In [20]:
math.sin(2)

0.9092974268256817

In [21]:
math.cos(2)

-0.4161468365471424

In [22]:
print(In)

['', 'help(len)', "get_ipython().run_line_magic('pinfo', 'len')", 'L=[1, 2, 3]', "get_ipython().run_line_magic('pinfo', 'L.insert')", "get_ipython().run_line_magic('pinfo', 'L')", 'def square(a):\n    """Return the square of a."""\n    return a ** 2', "get_ipython().run_line_magic('pinfo', 'square')", "get_ipython().run_line_magic('pinfo2', 'square')", "get_ipython().run_line_magic('pinfo2', 'len')", "get_ipython().run_line_magic('psearch', 'str.*find*')", 'square(2)', "get_ipython().run_line_magic('run', 'myscript.py')", 'square(5)', "get_ipython().run_line_magic('pinfo', '%run')", "get_ipython().run_line_magic('timeit', 'L= [n ** 2 for n in range(1000)]')", "get_ipython().run_line_magic('pinfo', '%timeit')", "get_ipython().run_line_magic('magic', '')", "get_ipython().run_line_magic('lsmagic', '')", 'import math', 'math.sin(2)', 'math.cos(2)', 'print(In)']


In [23]:
Out

{11: 4,
 13: 25,
 18: Available line magics:
%alias  %alias_magic  %autoawait  %autocall  %automagic  %autosave  %bookmark  %cd  %clear  %cls  %code_wrap  %colors  %conda  %config  %connect_info  %copy  %ddir  %debug  %dhist  %dirs  %doctest_mode  %echo  %ed  %edit  %env  %gui  %hist  %history  %killbgscripts  %ldir  %less  %load  %load_ext  %loadpy  %logoff  %logon  %logstart  %logstate  %logstop  %ls  %lsmagic  %macro  %magic  %mamba  %matplotlib  %micromamba  %mkdir  %more  %notebook  %page  %pastebin  %pdb  %pdef  %pdoc  %pfile  %pinfo  %pinfo2  %pip  %popd  %pprint  %precision  %prun  %psearch  %psource  %pushd  %pwd  %pycat  %pylab  %qtconsole  %quickref  %recall  %rehashx  %reload_ext  %ren  %rep  %rerun  %reset  %reset_selective  %rmdir  %run  %save  %sc  %set_env  %store  %sx  %system  %tb  %time  %timeit  %unalias  %unload_ext  %who  %who_ls  %whos  %xdel  %xmode

Available cell magics:
%%!  %%HTML  %%SVG  %%bash  %%capture  %%cmd  %%code_wrap  %%debug  %%file  %%html  %%java

In [24]:
print(In[1])

help(len)


In [25]:
print(Out[11])

4


In [26]:
Out[20]**2 + Out[21]**2

1.0

In [27]:
print(_)

1.0


In [28]:
print(__)

-0.4161468365471424


In [29]:
print(___)

0.9092974268256817


In [30]:
Out[21]

-0.4161468365471424

In [31]:
_21

-0.4161468365471424

In [32]:
math.sin(2) + math.cos(2);

In [33]:
32 in Out

False

In [34]:
%history -n 1-4

   1: help(len)
   2: len?
   3: L=[1, 2, 3]
   4: L.insert?


# Chapter 01.05

# IPython and Shell Commands

When working interactively with the standard Python interpreter, one of the frustrations is the need to switch between multiple windows to access Python tools and system command-line tools.
IPython bridges this gap, and gives you a syntax for executing shell commands directly from within the IPython terminal.
The magic happens with the exclamation point: anything appearing after ``!`` on a line will be executed not by the Python kernel, but by the system command-line.

The following assumes you're on a Unix-like system, such as Linux or Mac OSX.
Some of the examples that follow will fail on Windows, which uses a different type of shell by default (though with the 2016 announcement of native Bash shells on Windows, soon this may no longer be an issue!).
If you're unfamiliar with shell commands, I'd suggest reviewing the [Shell Tutorial](http://swcarpentry.github.io/shell-novice/) put together by the always excellent Software Carpentry Foundation.

## Quick Introduction to the Shell

A full intro to using the shell/terminal/command-line is well beyond the scope of this chapter, but for the uninitiated we will offer a quick introduction here.
The shell is a way to interact textually with your computer.
Ever since the mid-1980s, when Microsoft and Apple introduced the first versions of their now ubiquitous graphical operating systems, most computer users have interacted with their operating system through familiar clicking of menus and drag-and-drop movements.
But operating systems existed long before these graphical user interfaces, and were primarily controlled through sequences of text input: at the prompt, the user would type a command, and the computer would do what the user told it to.
Those early prompt systems are the precursors of the shells and terminals that most active data scientists still use today.

Someone unfamiliar with the shell might ask why you would bother with this, when many results can be accomplished by simply clicking on icons and menus.
A shell user might reply with another question: why hunt icons and click menus when you can accomplish things much more easily by typing?
While it might sound like a typical tech preference impasse, when moving beyond basic tasks it quickly becomes clear that the shell offers much more control of advanced tasks, though admittedly the learning curve can intimidate the average computer user.

As an example, here is a sample of a Linux/OSX shell session where a user explores, creates, and modifies directories and files on their system (``osx:~ $`` is the prompt, and everything after the ``$`` sign is the typed command; text that is preceded by a ``#`` is meant just as description, rather than something you would actually type in):

```bash
osx:~ $ echo "hello world"             # echo is like Python's print function
hello world

osx:~ $ pwd                            # pwd = print working directory
/home/jake                             # this is the "path" that we're sitting in

osx:~ $ ls                             # ls = list working directory contents
notebooks  projects 

osx:~ $ cd projects/                   # cd = change directory

osx:projects $ pwd
/home/jake/projects

osx:projects $ ls
datasci_book   mpld3   myproject.txt

osx:projects $ mkdir myproject          # mkdir = make new directory

osx:projects $ cd myproject/

osx:myproject $ mv ../myproject.txt ./  # mv = move file. Here we're moving the
                                        # file myproject.txt from one directory
                                        # up (../) to the current directory (./)
osx:myproject $ ls
myproject.txt
```

Notice that all of this is just a compact way to do familiar operations (navigating a directory structure, creating a directory, moving a file, etc.) by typing commands rather than clicking icons and menus.
With just a few commands (``pwd``, ``ls``, ``cd``, ``mkdir``, and ``mv``) you can do many of the most common file operations.
It's when you go beyond these basics that the shell approach becomes really powerful.

## Shell Commands in IPython

Any command that works at the command-line can be used in IPython by prefixing it with the ``!`` character.
For example, the ``ls``, ``pwd``, and ``echo`` commands can be run as follows:

```ipython
In [1]: !ls
myproject.txt

In [2]: !pwd
/home/jake/projects/myproject

In [3]: !echo "printing from the shell"
printing from the shell
```

## Passing Values to and from the Shell

Shell commands can not only be called from IPython, but can also be made to interact with the IPython namespace.
For example, you can save the output of any shell command to a Python list using the assignment operator:

```ipython
In [4]: contents = !ls

In [5]: print(contents)
['myproject.txt']

In [6]: directory = !pwd

In [7]: print(directory)
['/home/jake/projects/myproject']
```

Note that these results are not returned as lists, but as a special shell return type defined in IPython:

```ipython
In [8]: type(directory)
Out[8]: IPython.utils.text.SList
```

This looks and acts a lot like a Python list, but has additional functionality, such as
the ``grep`` and ``fields`` methods and the ``s``, ``n``, and ``p`` properties that allow you to search, filter, and display the results in convenient ways.
For more information on these, you can use IPython's built-in help features.

Communication in the other direction, passing Python variables into the shell, is possible using the ``{varname}`` syntax:

```ipython
In [9]: message = "hello from Python"

In [10]: !echo {message}
hello from Python
```

The curly braces contain the variable name, which is replaced by the variable's contents in the shell command.
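In a plain Python script, where the ``!`` syntax is not available, a comparable round trip can be sketched with the standard ``subprocess`` module. The ``shell_lines`` helper is a hypothetical name, and the example uses the current Python executable as a portable stand-in for a command such as ``echo``; it returns an ordinary list rather than IPython's ``SList``.

```python
import subprocess
import sys

def shell_lines(args):
    """Run a command and return its standard output as a list of lines."""
    completed = subprocess.run(args, capture_output=True, text=True, check=True)
    return completed.stdout.splitlines()

message = "hello from Python"

# The Python variable is passed to the child process as an ordinary argument,
# and the child's output comes back as a list with one string per line.
output = shell_lines(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", message]
)
print(output)
```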

## Shell-Related Magic Commands

If you play with IPython's shell commands for a while, you might notice that you cannot use ``!cd`` to navigate the filesystem:

```ipython
In [11]: !pwd
/home/jake/projects/myproject

In [12]: !cd ..

In [13]: !pwd
/home/jake/projects/myproject
```

The reason is that shell commands are executed in a temporary subshell, so the directory change is lost as soon as that subshell exits.
If you'd like to change the working directory in a more enduring way, you can use the ``%cd`` magic command:

```ipython
In [14]: %cd ..
/home/jake/projects
```

In fact, by default you can even use this without the ``%`` sign:

```ipython
In [15]: cd myproject
/home/jake/projects/myproject
```

This is known as an ``automagic`` function, and this behavior can be toggled with the ``%automagic`` magic function.

Besides ``%cd``, other available shell-like magic functions are ``%cat``, ``%cp``, ``%env``, ``%ls``, ``%man``, ``%mkdir``, ``%more``, ``%mv``, ``%pwd``, ``%rm``, and ``%rmdir``, any of which can be used without the ``%`` sign if ``automagic`` is on.
This makes it so that you can almost treat the IPython prompt as if it's a normal shell:

```ipython
In [16]: mkdir tmp

In [17]: ls
myproject.txt  tmp/

In [18]: cp myproject.txt tmp/

In [19]: ls tmp
myproject.txt

In [20]: rm -r tmp
```

This access to the shell from within the same terminal window as your Python session means that there is a lot less switching back and forth between interpreter and shell as you write your Python code.
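The subshell behavior described earlier in this section is not specific to IPython, and it can be reproduced with the standard library. The sketch below is an illustration under that assumption: a ``cd`` issued in a child shell leaves the parent process where it was, while ``os.chdir`` changes the directory of the Python process itself, which is roughly the role ``%cd`` plays.

```python
import os
import subprocess

start = os.getcwd()

# The cd runs in a child shell process, which exits immediately afterwards.
subprocess.run("cd ..", shell=True, check=True)
after_subshell = os.getcwd()      # unchanged: the parent process never moved

# os.chdir changes the working directory of the current Python process.
os.chdir("..")
after_chdir = os.getcwd()

print(after_subshell == start)
print(after_chdir == os.path.dirname(start))

os.chdir(start)                   # restore the original directory
```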

In [35]:
!dir

 El volumen de la unidad C es Windows-SSD
 El número de serie del volumen es: 4E34-E59D

 Directorio de C:\Users\Black\Documents\Jupyter\icd2024 Libro

01/09/2024  07:33 p. m.    <DIR>          .
01/09/2024  07:33 p. m.    <DIR>          ..
01/09/2024  02:48 p. m.    <DIR>          .ipynb_checkpoints
05/05/2023  04:20 p. m.            13,987 00.00-Preface.ipynb
01/09/2024  12:26 p. m.             8,523 01.00-IPython-Beyond-Normal-Python.ipynb
01/09/2024  12:35 p. m.            15,461 01.01-Help-And-Documentation.ipynb
05/05/2023  04:20 p. m.            10,620 01.02-Shell-Keyboard-Shortcuts.ipynb
01/09/2024  03:53 p. m.             9,950 01.03-Magic-Commands.ipynb
01/09/2024  04:58 p. m.             9,144 01.04-Input-Output-History.ipynb
05/05/2023  04:20 p. m.            11,580 01.05-IPython-And-Shell-Commands.ipynb
01/09/2024  02:25 p. m.            21,741 01.06-Errors-and-Debugging.ipynb
01/09/2024  02:25 p. m.            18,957 01.07-Timing-and-Profiling.ipynb
01/09/2024  02:25 p. m

In [36]:
!cd

C:\Users\Black\Documents\Jupyter\icd2024 Libro


In [37]:
!echo "printing from the shell"

"printing from the shell"


In [38]:
Contents = !dir

In [39]:
print(Contents)

[' El volumen de la unidad C es Windows-SSD', ' El n£mero de serie del volumen es: 4E34-E59D', '', ' Directorio de C:\\Users\\Black\\Documents\\Jupyter\\icd2024 Libro', '', '01/09/2024  07:33 p. m.    <DIR>          .', '01/09/2024  07:33 p. m.    <DIR>          ..', '01/09/2024  02:48 p. m.    <DIR>          .ipynb_checkpoints', '05/05/2023  04:20 p. m.            13,987 00.00-Preface.ipynb', '01/09/2024  12:26 p. m.             8,523 01.00-IPython-Beyond-Normal-Python.ipynb', '01/09/2024  12:35 p. m.            15,461 01.01-Help-And-Documentation.ipynb', '05/05/2023  04:20 p. m.            10,620 01.02-Shell-Keyboard-Shortcuts.ipynb', '01/09/2024  03:53 p. m.             9,950 01.03-Magic-Commands.ipynb', '01/09/2024  04:58 p. m.             9,144 01.04-Input-Output-History.ipynb', '05/05/2023  04:20 p. m.            11,580 01.05-IPython-And-Shell-Commands.ipynb', '01/09/2024  02:25 p. m.            21,741 01.06-Errors-and-Debugging.ipynb', '01/09/2024  02:25 p. m.            18,957 

In [58]:
Directory = !cd

In [41]:
print(Directory)

['C:\\Users\\Black\\Documents\\Jupyter\\icd2024 Libro']


In [42]:
type(Directory)

IPython.utils.text.SList

In [43]:
message = "Hello from Python"

In [44]:
!echo {message}

Hello from Python


In [45]:
!cd

C:\Users\Black\Documents\Jupyter\icd2024 Libro


In [46]:
!cd ..

In [47]:
!cd

C:\Users\Black\Documents\Jupyter\icd2024 Libro


In [48]:
%cd ..

C:\Users\Black\Documents\Jupyter




In [49]:
cd icd2024 Libro

C:\Users\Black\Documents\Jupyter\icd2024 Libro


In [50]:
mkdir tmp

In [51]:
ls

 El volumen de la unidad C es Windows-SSD
 El número de serie del volumen es: 4E34-E59D

 Directorio de C:\Users\Black\Documents\Jupyter\icd2024 Libro

01/09/2024  07:33 p. m.    <DIR>          .
01/09/2024  07:33 p. m.    <DIR>          ..
01/09/2024  02:48 p. m.    <DIR>          .ipynb_checkpoints
05/05/2023  04:20 p. m.            13,987 00.00-Preface.ipynb
01/09/2024  12:26 p. m.             8,523 01.00-IPython-Beyond-Normal-Python.ipynb
01/09/2024  12:35 p. m.            15,461 01.01-Help-And-Documentation.ipynb
05/05/2023  04:20 p. m.            10,620 01.02-Shell-Keyboard-Shortcuts.ipynb
01/09/2024  03:53 p. m.             9,950 01.03-Magic-Commands.ipynb
01/09/2024  04:58 p. m.             9,144 01.04-Input-Output-History.ipynb
05/05/2023  04:20 p. m.            11,580 01.05-IPython-And-Shell-Commands.ipynb
01/09/2024  02:25 p. m.            21,741 01.06-Errors-and-Debugging.ipynb
01/09/2024  02:25 p. m.            18,957 01.07-Timing-and-Profiling.ipynb
01/09/2024  02:25 p. m

In [52]:
copy Index.ipynb tmp

        1 archivo(s) copiado(s).


In [53]:
ls tmp

 El volumen de la unidad C es Windows-SSD
 El número de serie del volumen es: 4E34-E59D

 Directorio de C:\Users\Black\Documents\Jupyter\icd2024 Libro\tmp

01/09/2024  07:33 p. m.    <DIR>          .
01/09/2024  07:33 p. m.    <DIR>          ..
01/09/2024  01:35 p. m.             6,720 Index.ipynb
               1 archivos          6,720 bytes
               2 dirs  35,947,872,256 bytes libres


In [54]:
!del /q tmp

In [55]:
!rmdir tmp

# Chapter 01.06

# Errors and Debugging

Code development and data analysis always require a bit of trial and error, and IPython contains tools to streamline this process.
This section will briefly cover some options for controlling Python's exception reporting, followed by exploring tools for debugging errors in code.

## Controlling Exceptions: ``%xmode``

Most of the time when a Python script fails, it will raise an Exception.
When the interpreter hits one of these exceptions, information about the cause of the error can be found in the *traceback*, which can be accessed from within Python.
With the ``%xmode`` magic function, IPython allows you to control the amount of information printed when the exception is raised.
Consider the following code:

In [56]:
def func1(a, b):
    return a / b

def func2(x):
    a = x
    b = x - 1
    return func1(a, b)

In [57]:
func2(1)

ZeroDivisionError: division by zero

Calling ``func2`` results in an error, and reading the printed trace lets us see exactly what happened.
By default, this trace includes several lines showing the context of each step that led to the error.
Using the ``%xmode`` magic function (short for *Exception mode*), we can change what information is printed.

``%xmode`` takes a single argument, the mode, and there are three possibilities: ``Plain``, ``Context``, and ``Verbose``.
The default is ``Context``, which gives output like that shown above.
``Plain`` is more compact and gives less information:

In [59]:
%xmode Plain

Exception reporting mode: Plain


In [60]:
func2(1)

ZeroDivisionError: division by zero

The ``Verbose`` mode adds some extra information, including the arguments to any functions that are called:

In [61]:
%xmode Verbose

Exception reporting mode: Verbose


In [62]:
func2(1)

ZeroDivisionError: division by zero

This extra information can help narrow in on why the exception is being raised.
So why not use the ``Verbose`` mode all the time?
As code gets complicated, this kind of traceback can get extremely long.
Depending on the context, sometimes the brevity of the default ``Context`` mode, or of ``Plain``, is easier to work with.
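
The traceback that ``%xmode`` formats is an ordinary Python object, so it can also be inspected outside IPython. The following is a minimal sketch using the standard-library ``traceback`` module; the helper name ``failure_chain`` is hypothetical and introduced only for illustration.

```python
import traceback


def func1(a, b):
    return a / b


def func2(x):
    a = x
    b = x - 1
    return func1(a, b)


def failure_chain(func, *args):
    """Call func(*args); on failure return (exception name, function names in the traceback)."""
    try:
        func(*args)
    except Exception as exc:
        # extract_tb yields one summary per frame, outermost call first
        frames = traceback.extract_tb(exc.__traceback__)
        return type(exc).__name__, [frame.name for frame in frames]
    return None, []


exc_name, chain = failure_chain(func2, 1)
print(exc_name)
print(" -> ".join(chain))
```

The first entry in the chain is the helper itself, because that is where the exception was caught; the last two entries are the calls that ``%xmode`` would report.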

## Debugging: When Reading Tracebacks Is Not Enough

The standard Python tool for interactive debugging is ``pdb``, the Python debugger.
This debugger lets the user step through the code line by line in order to see what might be causing a more difficult error.
The IPython-enhanced version of this is ``ipdb``, the IPython debugger.

There are many ways to launch and use both these debuggers; we won't cover them fully here.
Refer to the online documentation of these two utilities to learn more.

In IPython, perhaps the most convenient interface to debugging is the ``%debug`` magic command.
If you call it after hitting an exception, it will automatically open an interactive debugging prompt at the point of the exception.
The ``ipdb`` prompt lets you explore the current state of the stack, explore the available variables, and even run Python commands!

Let's look at the most recent exception, then do some basic tasks: print the values of ``a`` and ``b``, and type ``quit`` to quit the debugging session:

In [63]:
%debug

> c:\users\black\appdata\local\temp\ipykernel_5328\4021589855.py(2)func1()



ipdb>  print(a)


1


ipdb>  print(b)


0


ipdb>  quit


The interactive debugger allows much more than this, though: we can even step up and down through the stack and explore the values of variables there:

In [64]:
%debug

> c:\users\black\appdata\local\temp\ipykernel_5328\4021589855.py(2)func1()



ipdb>  up


> c:\users\black\appdata\local\temp\ipykernel_5328\4021589855.py(7)func2()



ipdb>  print(x)


1


ipdb>  up


> c:\users\black\appdata\local\temp\ipykernel_5328\2483606204.py(1)<module>()



ipdb>  down


> c:\users\black\appdata\local\temp\ipykernel_5328\4021589855.py(7)func2()



ipdb>  quit


This allows you to quickly find out not only what caused the error, but what function calls led up to the error.
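
The information that the debugger shows with ``up``, ``down``, and ``print`` can also be collected non-interactively by walking the traceback's frames. This is only a sketch of the idea, not how ``ipdb`` is implemented; the helper names ``locals_by_frame`` and ``run_and_capture`` are hypothetical.

```python
def func1(a, b):
    return a / b


def func2(x):
    a = x
    b = x - 1
    return func1(a, b)


def locals_by_frame(exc):
    """Return {function name: copy of its local variables} for each traceback frame."""
    snapshot = {}
    tb = exc.__traceback__
    while tb is not None:  # walk from the outermost call to the failing one
        frame = tb.tb_frame
        snapshot[frame.f_code.co_name] = dict(frame.f_locals)
        tb = tb.tb_next
    return snapshot


def run_and_capture():
    """Trigger the error from the text and capture the locals of every frame."""
    try:
        func2(1)
    except ZeroDivisionError as exc:
        return locals_by_frame(exc)
    return {}


captured = run_and_capture()
print("func1 locals:", captured["func1"])
print("func2 locals:", captured["func2"])
```

Here ``b`` is ``0`` in ``func1`` and ``x`` is ``1`` in ``func2``, matching what the ``ipdb`` session above printed.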

If you'd like the debugger to launch automatically whenever an exception is raised, you can use the ``%pdb`` magic function to turn on this automatic behavior:

In [65]:
%xmode Plain
%pdb on
func2(1)

Exception reporting mode: Plain
Automatic pdb calling has been turned ON


ZeroDivisionError: division by zero

> c:\users\black\appdata\local\temp\ipykernel_5328\4021589855.py(2)func1()



ipdb>  print(b)


0


ipdb>  quit


Finally, if you have a script that you'd like to run from the beginning in interactive mode, you can run it with the command ``%run -d``, and use the ``next`` command to step through the lines of code interactively.

### Partial list of debugging commands

There are many more available commands for interactive debugging than we've listed here; the following table contains a description of some of the more common and useful ones:

| Command         |  Description                                                |
|-----------------|-------------------------------------------------------------|
| ``list``        | Show the current location in the file                       |
| ``h(elp)``      | Show a list of commands, or find help on a specific command |
| ``q(uit)``      | Quit the debugger and the program                           |
| ``c(ontinue)``  | Quit the debugger, continue in the program                  |
| ``n(ext)``      | Go to the next step of the program                          |
| ``<enter>``     | Repeat the previous command                                 |
| ``p(rint)``     | Print variables                                             |
| ``s(tep)``      | Step into a subroutine                                      |
| ``r(eturn)``    | Return out of a subroutine                                  |

For more information, use the ``help`` command in the debugger, or take a look at ``ipdb``'s [online documentation](https://github.com/gotcha/ipdb).

# Chapter 01.07

# Profiling and Timing Code

In the process of developing code and creating data processing pipelines, there are often trade-offs you can make between various implementations.
Early in developing your algorithm, it can be counterproductive to worry about such things. As Donald Knuth famously quipped, "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil."

But once you have your code working, it can be useful to dig into its efficiency a bit.
Sometimes it's useful to check the execution time of a given command or set of commands; other times it's useful to dig into a multiline process and determine where the bottleneck lies in some complicated series of operations.
IPython provides access to a wide array of functionality for this kind of timing and profiling of code.
Here we'll discuss the following IPython magic commands:

- ``%time``: Time the execution of a single statement
- ``%timeit``: Time repeated execution of a single statement for more accuracy
- ``%prun``: Run code with the profiler
- ``%lprun``: Run code with the line-by-line profiler
- ``%memit``: Measure the memory use of a single statement
- ``%mprun``: Run code with the line-by-line memory profiler

The last three commands are not bundled with IPython; to use them you'll need the ``line_profiler`` and ``memory_profiler`` extensions, which we will discuss in the following sections.

## Timing Code Snippets: ``%timeit`` and ``%time``

We saw the ``%timeit`` line-magic and ``%%timeit`` cell-magic in the introduction to magic functions in [IPython Magic Commands](01.03-Magic-Commands.ipynb); they can be used to time the repeated execution of snippets of code:

In [66]:
%timeit sum(range(100))

749 ns ± 61 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)


Note that because this operation is so fast, ``%timeit`` automatically does a large number of repetitions.
For slower commands, ``%timeit`` will automatically adjust and perform fewer repetitions:

In [67]:
%%timeit
total = 0
for i in range(1000):
    for j in range(1000):
        total += i * (-1) ** j

155 ms ± 8.11 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
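
Outside IPython, the standard-library ``timeit`` module offers similar behavior. The sketch below is an approximation of what ``%timeit`` reports rather than its exact procedure: ``Timer.autorange`` picks a loop count large enough for a stable measurement, and ``Timer.repeat`` then runs that many loops several times. The choice of three repeats is an assumption made to keep the example quick.

```python
import timeit

timer = timeit.Timer("sum(range(100))")

# Let timeit choose how many loops are needed for a measurable total time
loops, _ = timer.autorange()

# Repeat the measurement a few times and keep the fastest run
runs = timer.repeat(repeat=3, number=loops)
best_per_loop = min(runs) / loops

print(f"{loops} loops, best of 3: {best_per_loop * 1e9:.0f} ns per loop")
```

Fast statements end up with a large loop count and slow ones with a small count, which mirrors the automatic adjustment described above.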


Sometimes repeating an operation is not the best option.
For example, if we have a list that we'd like to sort, we might be misled by a repeated operation.
Sorting a pre-sorted list is much faster than sorting an unsorted list, so the repetition will skew the result:

In [68]:
import random
L = [random.random() for i in range(100000)]
%timeit L.sort()

1.42 ms ± 98.6 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


For this, the ``%time`` magic function may be a better choice. It is also a good choice for longer-running commands, when short, system-related delays are unlikely to affect the result.
Let's time the sorting of an unsorted and a presorted list:

In [69]:
import random
L = [random.random() for i in range(100000)]
print("sorting an unsorted list:")
%time L.sort()

sorting an unsorted list:
CPU times: total: 31.2 ms
Wall time: 35.9 ms


In [70]:
print("sorting an already sorted list:")
%time L.sort()

sorting an already sorted list:
CPU times: total: 0 ns
Wall time: 4.65 ms


Notice how much faster the presorted list is to sort, but notice also how much longer the timing takes with ``%time`` versus ``%timeit``, even for the presorted list!
This is because ``%timeit`` does some clever things under the hood to prevent system calls from interfering with the timing.
For example, it prevents cleanup of unused Python objects (known as *garbage collection*) which might otherwise affect the timing.
For this reason, ``%timeit`` results are usually noticeably faster than ``%time`` results.
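
The effect of presorted input can also be reproduced in plain Python by timing each call exactly once. This is a minimal sketch using ``time.perf_counter``; the helper ``time_once`` is a hypothetical name, and a single measurement like this is noisier than what ``%timeit`` reports.

```python
import random
import time


def time_once(func):
    """Run func() a single time and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    func()
    return time.perf_counter() - start


values = [random.random() for _ in range(100_000)]

unsorted_seconds = time_once(values.sort)   # first call sorts random data
presorted_seconds = time_once(values.sort)  # second call sees sorted data

print(f"unsorted:  {unsorted_seconds * 1e3:.2f} ms")
print(f"presorted: {presorted_seconds * 1e3:.2f} ms")
```

Because each call runs only once, the first measurement is not contaminated by the list having already been sorted by an earlier repetition.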

For ``%time`` as with ``%timeit``, using the double-percent-sign cell magic syntax allows timing of multiline scripts:

In [71]:
%%time
total = 0
for i in range(1000):
    for j in range(1000):
        total += i * (-1) ** j

CPU times: total: 250 ms
Wall time: 301 ms


For more information on ``%time`` and ``%timeit``, as well as their available options, use the IPython help functionality (i.e., type ``%time?`` at the IPython prompt).

## Profiling Full Scripts: ``%prun``

A program is made of many single statements, and sometimes timing these statements in context is more important than timing them on their own.
Python contains a built-in code profiler (which you can read about in the Python documentation), but IPython offers a much more convenient way to use this profiler, in the form of the magic function ``%prun``.

By way of example, we'll define a simple function that does some calculations:

In [72]:
def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
    return total

Now we can call ``%prun`` with a function call to see the profiled results:

In [73]:
%prun sum_of_lists(1000000)

 

         229 function calls (223 primitive calls) in 0.760 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.547    0.547    0.587    0.587 3519952779.py:1(sum_of_lists)
      2/1    0.110    0.055    0.047    0.047 history.py:845(writeout_cache)
        5    0.041    0.008    0.041    0.008 {built-in method builtins.sum}
        1    0.032    0.032    0.032    0.032 {method 'execute' of 'sqlite3.Connection' objects}
        2    0.015    0.008    0.015    0.008 {method '__exit__' of 'sqlite3.Connection' objects}
        1    0.015    0.015    0.602    0.602 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
      2/1    0.000    0.000    0.047    0.047 history.py:55(only_when_enabled)
        1    0.000    0.000    0.000    0.000 inspect.py:2945(apply_defaults)
        1    0.000    0.000    0.000    0.000 inspect.py:3129(_bind)
        1    0.00

In the notebook, the output is printed to the pager, and looks something like this:

```
14 function calls in 0.714 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        5    0.599    0.120    0.599    0.120 <ipython-input-19>:4(<listcomp>)
        5    0.064    0.013    0.064    0.013 {built-in method sum}
        1    0.036    0.036    0.699    0.699 <ipython-input-19>:1(sum_of_lists)
        1    0.014    0.014    0.714    0.714 <string>:1(<module>)
        1    0.000    0.000    0.714    0.714 {built-in method exec}
```

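The result is a table that shows, ordered by internal time (``tottime``), where the execution is spending the most time. In this listing, the bulk of execution time is in the list comprehension inside ``sum_of_lists``; on Python 3.12 and later, comprehensions are inlined into the enclosing function, so the same time is reported on the ``sum_of_lists`` line itself, as in the run above.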
From here, we could start thinking about what changes we might make to improve the performance in the algorithm.

For more information on ``%prun``, as well as its available options, use the IPython help functionality (i.e., type ``%prun?`` at the IPython prompt).
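
The built-in profiler mentioned above can also be driven directly from plain Python. The following sketch uses the standard-library ``cProfile`` and ``pstats`` modules to produce a report similar to the ``%prun`` output; the smaller argument and the limit of five printed rows are choices made here to keep the example short.

```python
import cProfile
import io
import pstats


def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
    return total


profiler = cProfile.Profile()
# runcall profiles a single function call and returns its result
result = profiler.runcall(sum_of_lists, 100_000)

# Format the statistics into a string, ordered by internal time
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("tottime").print_stats(5)
report = stream.getvalue()
print(report)
```

Sorting by ``"tottime"`` corresponds to the "Ordered by: internal time" line in the ``%prun`` output.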

## Line-By-Line Profiling with ``%lprun``

The function-by-function profiling of ``%prun`` is useful, but sometimes it's more convenient to have a line-by-line profile report.
This is not built into Python or IPython, but there is a ``line_profiler`` package available for installation that can do this.
Start by using Python's packaging tool, ``pip``, to install the ``line_profiler`` package:

```
$ pip install line_profiler
```

The same installation can also be run from a notebook cell; afterward, you can use IPython to load the ``line_profiler`` IPython extension, offered as part of this package:

In [75]:
pip install line_profiler

Collecting line_profiler
  Downloading line_profiler-4.1.3-cp312-cp312-win_amd64.whl.metadata (35 kB)
Downloading line_profiler-4.1.3-cp312-cp312-win_amd64.whl (126 kB)
Installing collected packages: line_profiler
Successfully installed line_profiler-4.1.3
Note: you may need to restart the kernel to use updated packages.


In [76]:
%load_ext line_profiler

Now the ``%lprun`` command will do a line-by-line profiling of any function. In this case, we need to tell it explicitly which functions we're interested in profiling:

In [77]:
%lprun -f sum_of_lists sum_of_lists(5000)

Timer unit: 1e-07 s

Total time: 0.0205745 s
File: C:\Users\Black\AppData\Local\Temp\ipykernel_5328\3519952779.py
Function: sum_of_lists at line 1

Line #      Hits         Time  Per Hit   % Time  Line Contents
     1                                           def sum_of_lists(N):
     2         1         11.0     11.0      0.0      total = 0
     3         6         81.0     13.5      0.0      for i in range(5):
     4     25005     202177.0      8.1     98.3          L = [j ^ (j >> i) for j in range(N)]
     5         5       3454.0    690.8      1.7          total += sum(L)
     6         1         22.0     22.0      0.0      return total

As before, the notebook sends the result to the pager, but it looks something like this:

```
Timer unit: 1e-06 s

Total time: 0.009382 s
File: <ipython-input-19-fa2be176cc3e>
Function: sum_of_lists at line 1

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     1                                           def sum_of_lists(N):
     2         1            2      2.0      0.0      total = 0
     3         6            8      1.3      0.1      for i in range(5):
     4         5         9001   1800.2     95.9          L = [j ^ (j >> i) for j in range(N)]
     5         5          371     74.2      4.0          total += sum(L)
     6         1            0      0.0      0.0      return total
```

The information at the top gives us the key to reading the results: times are reported in multiples of the timer unit (microseconds in this listing), and we can see where the program is spending the most time.
At this point, we may be able to use this information to modify aspects of the script and make it perform better for our desired use case.

For more information on ``%lprun``, as well as its available options, use the IPython help functionality (i.e., type ``%lprun?`` at the IPython prompt).

## Profiling Memory Use: ``%memit`` and ``%mprun``

Another aspect of profiling is the amount of memory an operation uses.
This can be evaluated with another IPython extension, the ``memory_profiler``.
As with the ``line_profiler``, we start by ``pip``-installing the extension:

```
$ pip install memory_profiler
```

The installation can also be run from a notebook cell; then we can use IPython to load the extension:

In [78]:
pip install memory_profiler

Collecting memory_profiler
  Downloading memory_profiler-0.61.0-py3-none-any.whl.metadata (20 kB)
Downloading memory_profiler-0.61.0-py3-none-any.whl (31 kB)
Installing collected packages: memory_profiler
Successfully installed memory_profiler-0.61.0
Note: you may need to restart the kernel to use updated packages.


In [79]:
%load_ext memory_profiler

The memory profiler extension contains two useful magic functions: the ``%memit`` magic (which offers a memory-measuring equivalent of ``%timeit``) and the ``%mprun`` function (which offers a memory-measuring equivalent of ``%lprun``).
The ``%memit`` function can be used rather simply:

In [80]:
%memit sum_of_lists(1000000)

peak memory: 157.74 MiB, increment: 73.36 MiB


In the run shown here, peak memory was about 158 MiB, an increment of roughly 73 MiB over the baseline before the call.
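
If installing an extension is not an option, the standard-library ``tracemalloc`` module gives a rough alternative. Note that this is a different technique from ``%memit``: it tracks only memory allocated by Python itself rather than the memory of the whole process, so the numbers are not directly comparable. A minimal sketch, using a smaller argument to keep it quick:

```python
import tracemalloc


def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
    return total


tracemalloc.start()
total = sum_of_lists(100_000)
# get_traced_memory returns (current, peak) sizes in bytes since start()
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"current: {current / 2**20:.2f} MiB, peak: {peak / 2**20:.2f} MiB")
```

The peak value reflects the largest list held in memory at any one time during the call.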

For a line-by-line description of memory use, we can use the ``%mprun`` magic.
Unfortunately, this magic works only for functions defined in separate modules rather than the notebook itself, so we'll start by using the ``%%file`` magic to create a simple module called ``mprun_demo.py``, which contains our ``sum_of_lists`` function, with one addition that will make our memory profiling results clearer:

In [81]:
%%file mprun_demo.py
def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
        del L # remove reference to L
    return total

Writing mprun_demo.py


We can now import the new version of this function and run the memory line profiler:

In [82]:
from mprun_demo import sum_of_lists
%mprun -f sum_of_lists sum_of_lists(1000000)




Filename: C:\Users\Black\Documents\Jupyter\icd2024 Libro\mprun_demo.py

Line #    Mem usage    Increment  Occurrences   Line Contents
     1     86.8 MiB     86.8 MiB           1   def sum_of_lists(N):
     2     86.8 MiB      0.0 MiB           1       total = 0
     3     88.3 MiB      0.0 MiB           6       for i in range(5):
     4    121.0 MiB -57770708.1 MiB     5000005           L = [j ^ (j >> i) for j in range(N)]
     5    121.0 MiB     -0.0 MiB           5           total += sum(L)
     6     88.3 MiB   -139.3 MiB           5           del L # remove reference to L
     7     88.3 MiB      0.0 MiB           1       return total

The result, printed to the pager, gives us a summary of the memory use of the function, and looks something like this:
```
Filename: ./mprun_demo.py

Line #    Mem usage    Increment   Line Contents
================================================
     4     71.9 MiB      0.0 MiB           L = [j ^ (j >> i) for j in range(N)]


Filename: ./mprun_demo.py

Line #    Mem usage    Increment   Line Contents
================================================
     1     39.0 MiB      0.0 MiB   def sum_of_lists(N):
     2     39.0 MiB      0.0 MiB       total = 0
     3     46.5 MiB      7.5 MiB       for i in range(5):
     4     71.9 MiB     25.4 MiB           L = [j ^ (j >> i) for j in range(N)]
     5     71.9 MiB      0.0 MiB           total += sum(L)
     6     46.5 MiB    -25.4 MiB           del L # remove reference to L
     7     39.1 MiB     -7.4 MiB       return total
```
Here the ``Increment`` column tells us how much each line affects the total memory budget: observe that in this listing, when we create and delete the list ``L``, we are adding about 25 MiB of memory usage.
This is on top of the background memory usage from the Python interpreter itself.

For more information on ``%memit`` and ``%mprun``, as well as their available options, use the IPython help functionality (i.e., type ``%memit?`` at the IPython prompt).

# Chapter 01.08

# More IPython Resources

In this chapter, we've just scratched the surface of using IPython to enable data science tasks.
Much more information is available both in print and on the Web, and here we'll list some other resources that you may find helpful.

## Web Resources

- [The IPython website](http://ipython.org): The IPython website links to documentation, examples, tutorials, and a variety of other resources.
- [The nbviewer website](http://nbviewer.jupyter.org/): This site shows static renderings of any IPython notebook available on the internet. The front page features some example notebooks that you can browse to see what other folks are using IPython for!
- [A gallery of interesting Jupyter Notebooks](https://github.com/jupyter/jupyter/wiki/A-gallery-of-interesting-Jupyter-Notebooks/): This ever-growing list of notebooks, powered by nbviewer, shows the depth and breadth of numerical analysis you can do with IPython. It includes everything from short examples and tutorials to full-blown courses and books composed in the notebook format!
- Video Tutorials: searching the Internet, you will find many video-recorded tutorials on IPython. I'd especially recommend seeking tutorials from the PyCon, SciPy, and PyData conferences by Fernando Perez and Brian Granger, two of the primary creators and maintainers of IPython and Jupyter.

## Books

- [*Python for Data Analysis*](http://shop.oreilly.com/product/0636920023784.do): Wes McKinney's book includes a chapter that covers using IPython as a data scientist. Although much of the material overlaps what we've discussed here, another perspective is always helpful.
- [*Learning IPython for Interactive Computing and Data Visualization*](https://www.packtpub.com/big-data-and-business-intelligence/learning-ipython-interactive-computing-and-data-visualization): This short book by Cyrille Rossant offers a good introduction to using IPython for data analysis.
- [*IPython Interactive Computing and Visualization Cookbook*](https://www.packtpub.com/big-data-and-business-intelligence/ipython-interactive-computing-and-visualization-cookbook): Also by Cyrille Rossant, this book is a longer and more advanced treatment of using IPython for data science. Despite its name, it's not just about IPython: it also goes into some depth on a broad range of data science topics.

Finally, a reminder that you can find help on your own: IPython's ``?``-based help functionality (discussed in [Help and Documentation in IPython](01.01-Help-And-Documentation.ipynb)) can be very useful if you use it well and use it often.
As you go through the examples here and elsewhere, this can be used to familiarize yourself with all the tools that IPython has to offer.