# Python Language Basics, IPython, and Jupyter Notebooks

## The Python Interpreter[](#python_interpreter)

- Python is an _interpreted_ language. 
- The Python interpreter runs a program by executing one statement at a time. 

The standard interactive Python interpreter can be invoked on the command line with the `python` command:

    $ python
    Python 3.10.4 | packaged by conda-forge | (main, Mar 24 2022, 17:38:57)
    [GCC 10.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> a = 5
    >>> print(a)
    5

- The `>>>` you see is the _prompt_ after which you’ll type code expressions. 
- To exit the Python interpreter, you can either type `exit()` or press Ctrl-D (works on Linux and macOS only).

- Running Python programs is as simple as calling `python` with a _.py_ file as its first argument. 

Suppose we had created _hello\_world.py_ with these contents:

    print("Hello world")

You can run it by executing the following command (the _hello\_world.py_ file must be in your current working directory):

    $ python hello_world.py
    Hello world

- While some Python programmers execute all of their Python code in this way, those doing data analysis or scientific computing make use of IPython, an enhanced Python interpreter, or Jupyter notebooks, web-based code notebooks originally created within the IPython project. 

When you use the `%run` command, IPython executes the code in the specified file in the same process, enabling you to explore the results interactively when it’s done:

    $ ipython
    Python 3.10.4 | packaged by conda-forge | (main, Mar 24 2022, 17:38:57)
    Type 'copyright', 'credits' or 'license' for more information
    IPython 7.31.1 -- An enhanced Interactive Python. Type '?' for help.
    
    In [1]: %run hello_world.py
    Hello world
    
    In [2]:
    
The default IPython prompt adopts the numbered `In [2]:` style, compared with the standard `>>>` prompt.

## IPython Basics[](#ipython_basics)

### Running the IPython Shell[](#ipython_basics_shell)

You can launch the IPython shell on the command line just like launching the regular Python interpreter except with the `ipython` command:

    $ ipython
    Python 3.10.4 | packaged by conda-forge | (main, Mar 24 2022, 17:38:57)
    Type 'copyright', 'credits' or 'license' for more information
    IPython 7.31.1 -- An enhanced Interactive Python. Type '?' for help.
    
    In [1]: a = 5
    
    In [2]: a
    Out[2]: 5

You can execute arbitrary Python statements by typing them and pressing Return (or Enter). When you type just a variable into IPython, it renders a string representation of the object:

    In [5]: import numpy as np
    
    In [6]: data = [np.random.standard_normal() for i in range(7)]
    
    In [7]: data
    Out[7]: 
    [-0.20470765948471295,
     0.47894333805754824,
     -0.5194387150567381,
     -0.55573030434749,
     1.9657805725027142,
     1.3934058329729904,
     0.09290787674371767]

The first two lines are Python code **statements**; the second statement creates a variable named `data` that refers to a newly created Python list. The last line prints the value of `data` in the console.

Many kinds of Python objects are formatted to be more readable, or _pretty-printed_, which is distinct from normal printing with `print`. If you printed the above `data` variable in the standard Python interpreter, it would be much less readable:

    >>> import numpy as np
    >>> data = [np.random.standard_normal() for i in range(7)]
    >>> print(data)
    [-0.5767699931966723, -0.1010317773535111, -1.7841005313329152,
    -1.524392126408841, 0.22191374220117385, -1.9835710588082562,
    -1.6081963964963528]

### Running the Jupyter Notebook[](#ipython_basics_notebook)

- One of the major components of the Jupyter project is the _notebook_, a type of interactive document for code, text (including Markdown), data visualizations, and other output. 
- The Jupyter notebook interacts with _kernels_, which are implementations of the Jupyter interactive computing protocol specific to different programming languages. 
- The Python Jupyter kernel uses the IPython system for its underlying behavior.

To start up Jupyter, run the command `jupyter notebook` in a terminal:

    $ jupyter notebook
    [I 15:20:52.739 NotebookApp] Serving notebooks from local directory:
    /home/wesm/code/pydata-book
    [I 15:20:52.739 NotebookApp] 0 active kernels
    [I 15:20:52.739 NotebookApp] The Jupyter Notebook is running at:
    http://localhost:8888/?token=0a77b52fefe52ab83e3c35dff8de121e4bb443a63f2d...
    [I 15:20:52.740 NotebookApp] Use Control-C to stop this server and shut down
    all kernels (twice to skip confirmation).
    Created new window in existing browser session.
        To access the notebook, open this file in a browser:
            file:///home/wesm/.local/share/jupyter/runtime/nbserver-185259-open.html
        Or copy and paste one of these URLs:
            http://localhost:8888/?token=0a77b52fefe52ab83e3c35dff8de121e4...
         or http://127.0.0.1:8888/?token=0a77b52fefe52ab83e3c35dff8de121e4...

On many platforms, Jupyter will automatically open in your default web browser (unless you start it with `--no-browser`). Otherwise, you can navigate to the HTTP address printed when you started the notebook.

Note

Many people use Jupyter as a local computing environment, but it can also be deployed on servers and accessed remotely.

![pda3_0201.png](attachment:576ac0a2-fe76-480c-952a-5fd2f9830981.png)

- To create a new notebook, click the New button and select the "Python 3" option. 
- Try clicking on the empty code "cell" and entering a line of Python code. Then press Shift-Enter to execute it.

![pda3_0202.png](attachment:e97b12cc-db88-471f-af57-bd23363eec49.png)

- When you save the notebook (see "Save and Checkpoint" under the notebook File menu), it creates a file with the extension _.ipynb_. 
- This is a self-contained file format that contains all of the content (including any evaluated code output) currently in the notebook. 
- These can be loaded and edited by other Jupyter users.

- To rename an open notebook, click on the notebook title at the top of the page and type the new title, pressing Enter when you are finished.
- To load an existing notebook, put the file in the same directory where you started the notebook process (or in a subfolder within it), then click the name from the landing page. 
- When you want to close a notebook, click the File menu and select "Close and Halt." 
    - If you simply close the browser tab, the Python process associated with the notebook will keep running in the background.


![pda3_0203.png](attachment:b8cdf7ca-3cf8-44ac-aa18-c0157d250ce1.png)
Figure 2.3: Jupyter example view for an existing notebook

[](#fig-figure_jupyter_existing_nb)

### Tab Completion[](#ipython_completion)

While entering expressions in the shell, pressing the Tab key will search the namespace for any variables (objects, functions, etc.) matching the characters you have typed so far and show the results in a convenient drop-down menu:

    In [1]: an_apple = 27
    
    In [2]: an_example = 42
    
    In [3]: an<Tab>
    an_apple   an_example  any

You can also complete methods and attributes on any object after typing a period:

    In [3]: b = [1, 2, 3]
    
    In [4]: b.<Tab>
    append()  count()   insert()  reverse()
    clear()   extend()  pop()     sort()
    copy()    index()   remove()

The same is true for modules:

    In [1]: import datetime
    
    In [2]: datetime.<Tab>
    date          MAXYEAR       timedelta
    datetime      MINYEAR       timezone
    datetime_CAPI time          tzinfo

Note

Note that IPython by default hides methods and attributes starting with underscores, such as magic methods and internal “private” methods and attributes, in order to avoid cluttering the display (and confusing novice users!). These, too, can be tab-completed, but you must first type an underscore to see them. 

When typing anything that looks like a file path (even in a Python string), pressing the Tab key will complete anything on your computer’s filesystem matching what you’ve typed.

Another area where tab completion saves time is in the completion of function keyword arguments (including the `=` sign!). 

![pda3_0204.png](attachment:91acb799-91b3-43d0-8d45-ce331277718f.png)


### Introspection[](#ipython_introspection)

Using a question mark (`?`) before or after a variable will display some general information about the object:

    In [1]: b = [1, 2, 3]
    
    In [2]: b?
    Type:        list
    String form: [1, 2, 3]
    Length:      3
    Docstring:
    Built-in mutable sequence.
    
    If no argument is given, the constructor creates a new empty list.
    The argument must be an iterable if specified.

    
    In [3]: print?
    Docstring:
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
    
    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.
    Type:      builtin_function_or_method

- This is referred to as _object introspection_. 
- If the object is a function or instance method, the docstring, if defined, will also be shown. 

Suppose we’d written the following function (which you can reproduce in IPython or Jupyter):

```python
    def add_numbers(a, b):
        """
        Add two numbers together
    
        Returns
        -------
        the_sum : type of arguments
        """
        return a + b
```

Then using `?` shows us the docstring:

    In [6]: add_numbers?
    Signature: add_numbers(a, b)
    Docstring:
    Add two numbers together
    Returns
    -------
    the_sum : type of arguments
    File:      <ipython-input-9-6a548a216e27>
    Type:      function

`?` has a final usage, which is for searching the IPython namespace in a manner similar to the standard Unix or Windows command line. A number of characters combined with the wildcard (`*`) will show all names matching the wildcard expression. For example, we could get a list of all functions in the top-level NumPy namespace containing `load`:

    In [9]: import numpy as np
    
    In [10]: np.*load*?
    np.__loader__
    np.load
    np.loads
    np.loadtxt

## Python Language Basics[](#tut_basics)

### Language Semantics[](#language_semantics)

The Python language design is distinguished by its emphasis on readability, simplicity, and explicitness. Some people go so far as to liken it to “executable pseudocode.”

#### Indentation, not braces[](#semantics_whitespace)

Python uses whitespace (tabs or spaces) to structure code instead of using braces as in many other languages like R, C++, Java, and Perl. Consider a `for` loop from a sorting algorithm:

```python
    for x in array:
        if x < pivot:
            less.append(x)
        else:
            greater.append(x)
```

A colon denotes the start of an indented code block after which all of the code must be indented by the same amount until the end of the block.

Note

Python statements also do not need to be terminated by semicolons. Semicolons can be used, however, to separate multiple statements on a single line:
```python
    a = 5; b = 6; c = 7
```
Putting multiple statements on one line is generally discouraged in Python as it can make code less readable.

#### Everything is an object[](#semantics_everything_object)

- Every number, string, data structure, function, class, module, and so on exists in the Python interpreter in its own “box,” which is referred to as a _Python object_. 
- Each object has an associated _type_ (e.g., _integer_, _string_, or _function_) and internal data. 
- In practice this makes the language very flexible, as even functions can be treated like any other object.
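
As a small illustration of the points above, the following sketch treats functions like any other object by storing them in a list and calling them in a loop. The names `double` and `operations` are made up for this example:

```python
def double(x):
    return x * 2


# A function has a type, just like a number or a string
print(type(double))  # <class 'function'>

# Functions (user-defined and built-in) can be stored in a data structure
operations = [double, abs, str]

# ...and called later like any other value
results = [op(-4) for op in operations]
print(results)  # [-8, 4, '-4']
```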

#### Comments[](#semantics_comments)

- Any text preceded by the hash mark (pound sign) `#` is ignored by the Python interpreter. 
- This is often used to add comments to code. 
- At times you may also want to exclude certain blocks of code without deleting them. 

One solution is to _comment out_ the code:
```python
    results = []
    for line in file_handle:
        # keep the empty lines for now
        # if len(line) == 0:
        #   continue
        results.append(line.replace("foo", "bar"))
```

Comments can also occur after a line of executed code. While some programmers prefer comments to be placed in the line preceding a particular line of code, this can be useful at times:

```python
    print("Reached this line")  # Simple status report
```

#### Function and object method calls[](#semantics_function_calls)

You call functions using parentheses and passing zero or more arguments, optionally assigning the returned value to a variable:
```python
    result = f(x, y, z)
    g()
```

Almost every object in Python has attached functions, known as _methods_, that have access to the object’s internal contents. You can call them using the following syntax:

```python
    obj.some_method(x, y, z)
```

Functions can take both _positional_ and _keyword_ arguments:
```python
    result = f(a, b, c, d=5, e="foo")
```
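
To make the distinction concrete, here is a minimal sketch with a hypothetical function `f` that has three positional parameters and two keyword parameters with default values. Keyword arguments can be omitted or given in any order:

```python
def f(a, b, c, d=0, e="bar"):
    # Return the arguments unchanged so the binding is easy to inspect
    return (a, b, c, d, e)


defaults_used = f(1, 2, 3)
print(defaults_used)  # (1, 2, 3, 0, 'bar')

# Keyword arguments may appear in any order after the positional ones
keywords_given = f(1, 2, 3, e="foo", d=5)
print(keywords_given)  # (1, 2, 3, 5, 'foo')
```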

#### Variables and argument passing[](#semantics_references)

- When assigning a variable (or _name_) in Python, you are creating a _reference_ to the object shown on the righthand side of the equals sign. 

In practical terms, consider a list of integers:
```python
    a = [1, 2, 3]
```
Suppose we assign `a` to a new variable `b`:

```python
    b = a
    
    b
```
    Out[10]: [1, 2, 3]

In some languages, the assignment of `b` will cause the data `[1, 2, 3]` to be copied. In Python, `a` and `b` actually now refer to the same object, the original list `[1, 2, 3]`.

![pda3_0205.png](attachment:a816f4ce-4334-4947-83f7-e567a93e1385.png)


You can prove this to yourself by appending an element to `a` and then examining `b`:

```python
    a.append(4)
    
    b
```
    Out[12]: [1, 2, 3, 4]

Note

Assignment is also referred to as _binding_, as we are binding a name to an object. Variable names that have been assigned may occasionally be referred to as bound variables.
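
The same reference semantics apply when objects are passed to a function: the parameter becomes another name for the caller's object, and no copy is made. A minimal sketch, using an illustrative helper named `append_element`:

```python
def append_element(some_list, element):
    # some_list refers to the caller's list object, not a copy
    some_list.append(element)


data = [1, 2, 3]
append_element(data, 4)

# The change made inside the function is visible to the caller
print(data)  # [1, 2, 3, 4]
```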

#### Dynamic references, strong types[](#semantics_strongly_typed)

Variables in Python have no inherent type associated with them; a variable can refer to a different type of object simply by doing an assignment. There is no problem with the following:

In [1]:
a = 5
    
type(a)

int

In [2]:
    
a = "foo"
    
type(a)

str

Variables are names for objects within a particular namespace; the type information is stored in the object itself. 

In [3]:
"5" + 5

TypeError: can only concatenate str (not "int") to str

In some languages, the string `'5'` might get implicitly converted (or _cast_) to an integer, thus yielding 10. In other languages the integer `5` might be cast to a string, yielding the concatenated string `'55'`. In Python, such implicit casts are not allowed. 

In this regard we say that Python is a _strongly typed_ language, which means that every object has a specific type (or _class_), and implicit conversions will occur only in certain permitted circumstances, such as:

In [4]:
a = 4.5
    
b = 2
    
# String formatting, to be visited later
print(f"a is {type(a)}, b is {type(b)}")

a is <class 'float'>, b is <class 'int'>


In [5]:
a / b

2.25

Here, even though `b` is an integer, it is implicitly converted to a float for the division operation.
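
Because the type lives in the object, it can be checked at runtime with the built-in `isinstance` function, and conversions that Python will not do implicitly can be requested explicitly. The sketch below revisits the `"5" + 5` case; the variable names are chosen for illustration only:

```python
a = 5
b = 4.5

# isinstance accepts a single type or a tuple of types
print(isinstance(a, int))           # True
print(isinstance(b, (int, float)))  # True
print(isinstance(b, int))           # False

# Explicit conversion makes the intent unambiguous
total = int("5") + 5      # numeric addition
joined = "5" + str(5)     # string concatenation
print(total, joined)      # 10 55
```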

#### Attributes and methods[](#attributes_and_methods)

Objects in Python typically have both attributes (other Python objects stored “inside” the object) and methods (functions associated with an object that can have access to the object’s internal data). Both of them are accessed via the syntax `obj.attribute_name`:

In [None]:
a = "foo"
    
#press TAB for completion
a.

#### Imports[](#semantics_imports)

In Python, a _module_ is simply a file with the _.py_ extension containing Python code. 

Suppose we had the following module:

```python
    # some_module.py
    PI = 3.14159
    
    def f(x):
        return x + 2
    
    def g(a, b):
        return a + b
```

If we wanted to access the variables and functions defined in _some\_module.py_ from another file in the same directory, we could do:
```python
    import some_module
    result = some_module.f(5)
    pi = some_module.PI
```

Or alternately:
```python
    from some_module import g, PI
    result = g(5, PI)
```

By using the `as` keyword, you can give imports different variable names:
```python
    import some_module as sm
    from some_module import PI as pi, g as gf
    
    r1 = sm.f(pi)
    r2 = gf(6, pi)
```

#### Binary operators and comparisons[](#semantics_binary_ops)

Most of the binary math operations and comparisons use familiar mathematical syntax used in other programming languages:

In [None]:
5 - 7

In [None]:
    
12 + 21.5

In [None]:
    
5 <= 2

See [Table 2.1](https://wesmckinney.com/book/python-basics.html#tbl-table_binary_ops) for all of the available binary operators.


To check if two variables refer to the same object, use the `is` keyword. Use `is not` to check that two objects are not the same:

In [7]:
a = [1, 2, 3]
    
b = a
    
c = list(a)

In [8]:
    
a is b

    

True

In [9]:
a is not c

True

Since the `list` function always creates a new Python list (i.e., a copy), we can be sure that `c` is distinct from `a`. Comparing with `is` is not the same as the `==` operator, because in this case we have:

In [10]:
a == c

True

A common use of `is` and `is not` is to check if a variable is `None`, since there is only one instance of `None`:

In [11]:
a = None
    
a is None

True
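
The difference between identity (`is`) and equality (`==`) can be summarized in one short sketch. The `describe` function is a hypothetical helper added to show why `is None` is preferred over a plain truthiness test: a legitimate value such as `0` would otherwise be mistaken for a missing one.

```python
a = [1, 2, 3]
b = a          # another name for the same list
c = list(a)    # a new list with equal contents

same_identity = a is b                      # True
equal_but_distinct = a == c and a is not c  # True


def describe(value=None):
    # Testing "if not value" would wrongly treat 0 or "" as missing
    if value is None:
        return "no value given"
    return f"got {value!r}"


print(describe())   # no value given
print(describe(0))  # got 0
```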

#### Mutable and immutable objects[](#semantics_mutability)

Many objects in Python, such as lists, dictionaries, NumPy arrays, and most user-defined types (classes), are _mutable_. This means that the object or values that they contain can be modified:

In [12]:
a_list = ["foo", 2, [4, 5]]
    
a_list[2] = (3, 4)
    
a_list

['foo', 2, (3, 4)]

Others, like strings and tuples, are immutable, which means their internal data cannot be changed:

In [13]:
a_tuple = (3, 5, (4, 5))
    
a_tuple[1] = "four"

TypeError: 'tuple' object does not support item assignment

Remember that just because you _can_ mutate an object does not mean that you always _should_. Such actions are known as _side effects_. For example, when writing a function, any side effects should be explicitly communicated to the user in the function’s documentation or comments. If possible, try to avoid side effects and _favor immutability_, even though there may be mutable objects involved.
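
One way to picture the advice above is to compare a function that mutates its argument with one that returns a new object. Both function names are invented for this sketch:

```python
def add_item_in_place(items, item):
    """Append item to items. Side effect: modifies the caller's list."""
    items.append(item)
    return items


def add_item_copy(items, item):
    """Return a new list with item appended; the input is left untouched."""
    return items + [item]


original = [1, 2]

extended = add_item_copy(original, 3)
print(original, extended)  # [1, 2] [1, 2, 3]

add_item_in_place(original, 3)
print(original)  # [1, 2, 3]
```

The copying version costs extra memory, so which style is appropriate depends on the size of the data and on what callers expect.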

### Scalar Types[](#scalar_types)

Python has a small set of built-in types for handling numerical data, strings, Boolean (`True` or `False`) values, and dates and times. These "single value" types are sometimes called _scalar types_, and we refer to them in this book as _scalars_. See [Table 2.2](https://wesmckinney.com/book/python-basics.html#tbl-table_python_scalar_types) for a list of the main scalar types. 

Date and time handling will be discussed separately, as these are provided by the `datetime` module in the standard library.

#### Numeric types[](#scalar_numeric)

The primary Python types for numbers are `int` and `float`. An `int` can store arbitrarily large numbers:

In [14]:
ival = 17239871
    
ival ** 6

26254519291092456596965462913230729701102721

Floating-point numbers are represented with the Python `float` type. Under the hood, each one is a double-precision value. They can also be expressed with scientific notation:

In [15]:
fval = 7.243
    
fval2 = 6.78e-5

Integer division not resulting in a whole number will always yield a floating-point number:

In [None]:
3 / 2

To get C-style integer division (which drops the fractional part if the result is not a whole number), use the floor division operator `//`:

In [None]:
3 // 2
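
One detail worth checking for yourself: `//` rounds toward negative infinity, whereas integer division in C truncates toward zero, so the two agree only when the result is nonnegative. The sketch below contrasts the two behaviors and also shows the built-in `divmod`, which returns the quotient and remainder together:

```python
positive = 3 // 2         # 1
negative = -3 // 2        # -2: floored toward negative infinity
truncated = int(-3 / 2)   # -1: truncated toward zero, as C would do

print(positive, negative, truncated)  # 1 -2 -1

# divmod returns (a // b, a % b) in a single call
quotient, remainder = divmod(7, 2)
print(quotient, remainder)  # 3 1
```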

#### Strings[](#scalar_strings)

Many people use Python for its built-in string handling capabilities. You can write _string literals_ using either single quotes `'` or double quotes `"` (double quotes are generally favored):

In [16]:
    a = 'one way of writing a string'
    b = "another way"

The Python string type is `str`.

For multiline strings with line breaks, you can use triple quotes, either `'''` or `"""`:

In [17]:
    c = """
    This is a longer string that
    spans multiple lines
    """

It may surprise you that this string `c` actually contains four lines of text; the line breaks after `"""` and after `lines` are included in the string. We can count the new line characters with the `count` method on `c`:

In [18]:
c.count("\n")

3

Python strings are immutable; you cannot modify a string:

In [19]:
a = "this is a string"
    
a[10] = "f"

TypeError: 'str' object does not support item assignment

To interpret this error message, read from the bottom up. We tried to replace the character (the "item") at position 10 with the letter `"f"`, but this is not allowed for string objects. 

If we need to modify a string, we have to use a function or method that creates a new string, such as the string `replace` method:

In [None]:
b = a.replace("string", "longer string")
    
b

After this operation, the variable `a` is unmodified:

In [None]:
a

Many Python objects can be converted to a string using the `str` function:

In [20]:
a = 5.6
    
s = str(a)
    
print(s)

5.6


Strings are a sequence of Unicode characters and therefore can be treated like other sequences, such as lists and tuples:

In [21]:
s = "python"
    
list(s)

['p', 'y', 't', 'h', 'o', 'n']

In [22]:
s[:3]

'pyt'

The syntax `s[:3]` is called _slicing_ and is implemented for many kinds of Python sequences. This will be explained in more detail later on.

The backslash character `\` is an _escape character_, meaning that it is used to specify special characters like newline `\n` or Unicode characters. To write a string literal with backslashes, you need to escape them:

In [23]:
s = "12\\34"
    
print(s)

12\34


If you have a string with a lot of backslashes and no special characters, you might find this a bit annoying. Fortunately you can preface the leading quote of the string with `r`, which means that the characters should be interpreted as is:

In [24]:
s = r"this\has\no\special\characters"
    
s

'this\\has\\no\\special\\characters'

The `r` stands for _raw_.

Adding two strings together concatenates them and produces a new string:

In [25]:
a = "this is the first half "
    
b = "and this is the second half"
    
a + b

'this is the first half and this is the second half'

String templating or formatting is another important topic. String objects have a `format` method that can be used to substitute formatted arguments into the string, producing a new string:

In [26]:
template = "{0:.2f} {1:s} are worth US${2:d}"

In this string:

*   `{0:.2f}` means to format the first argument as a floating-point number with two decimal places.
    
*   `{1:s}` means to format the second argument as a string.
    
*   `{2:d}` means to format the third argument as an exact integer.

To substitute arguments for these format parameters, we pass a sequence of arguments to the `format` method:

In [27]:
template.format(88.46, "Argentine Pesos", 1)

'88.46 Argentine Pesos are worth US$1'

Python 3.6 introduced a new feature called _f-strings_ (short for _formatted string literals_) which can make creating formatted strings even more convenient. To create an f-string, write the character `f` immediately preceding a string literal. Within the string, enclose Python expressions in curly braces to substitute the value of the expression into the formatted string:

In [30]:
amount = 10
    
rate = 88.46
    
currency = "Pesos"
    
result = f"{amount} {currency} is worth US${amount / rate}"

print(result)

10 Pesos is worth US$0.11304544426859599


Format specifiers can be added after each expression using the same syntax as with the string templates above:

In [29]:
f"{amount} {currency} is worth US${amount / rate:.2f}"

'10 Pesos is worth US$0.11'

String formatting is a deep topic; there are multiple methods and numerous options and tweaks available to control how values are formatted in the resulting string. To learn more, consult the [official Python documentation](https://docs.python.org/3/library/string.html).
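
As a small sample of those options, the following sketch applies a few commonly used format specifiers inside f-strings. The values are arbitrary and chosen only for illustration:

```python
value = 1234567.891

with_commas = f"{value:,.2f}"   # thousands separator, two decimals
padded = f"{value:>15.1f}"      # right-aligned in a field 15 characters wide
percent = f"{0.256:.1%}"        # multiplied by 100 with a percent sign
zero_filled = f"{42:05d}"       # integer padded with leading zeros

print(with_commas)  # 1,234,567.89
print(percent)      # 25.6%
print(zero_filled)  # 00042
```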

#### Bytes and Unicode[](#scalar_bytes)

In modern Python (i.e., Python 3.0 and up), Unicode has become the first-class string type to enable more consistent handling of ASCII and non-ASCII text. In older versions of Python, strings were all bytes without any explicit Unicode encoding. You could convert to Unicode assuming you knew the character encoding. Here is an example Unicode string with non-ASCII characters:

In [31]:
val = "español"
    
val

'español'

We can convert this Unicode string to its UTF-8 bytes representation using the `encode` method:

In [32]:
val_utf8 = val.encode("utf-8")
    
val_utf8

b'espa\xc3\xb1ol'

In [33]:
type(val_utf8)

bytes

Assuming you know the Unicode encoding of a `bytes` object, you can go back using the `decode` method:

In [34]:
val_utf8.decode("utf-8")

'español'

While it is now preferable to use UTF-8 for any encoding, for historical reasons you may encounter data in any number of different encodings:

In [35]:
val.encode("latin1")

b'espa\xf1ol'

In [36]:
  
val.encode("utf-16")

b'\xff\xfee\x00s\x00p\x00a\x00\xf1\x00o\x00l\x00'

In [37]:
val.encode("utf-16le")

b'e\x00s\x00p\x00a\x00\xf1\x00o\x00l\x00'

It is most common to encounter `bytes` objects in the context of working with files, where implicitly decoding all data to Unicode strings may not be desired.
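
To illustrate the file case, this sketch writes a string to a temporary file and reads it back twice: once in binary mode (`"rb"`), which yields `bytes`, and once in text mode with an explicit encoding, which yields `str`. The file name `example.txt` is arbitrary:

```python
import os
import tempfile

text = "español"

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.txt")

    # Text mode: the string is encoded to UTF-8 on the way out
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    # Binary mode: no decoding takes place, so we get raw bytes
    with open(path, "rb") as f:
        raw = f.read()

    # Text mode again: bytes are decoded back to a Unicode string
    with open(path, encoding="utf-8") as f:
        decoded = f.read()

print(raw)      # b'espa\xc3\xb1ol'
print(decoded)  # español
```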

#### Booleans[](#scalar_boolean)

The two Boolean values in Python are written as `True` and `False`. Comparisons and other conditional expressions evaluate to either `True` or `False`. Boolean values are combined with the `and` and `or` keywords:

True and True

    
False or True


When converted to numbers, `False` becomes `0` and `True` becomes `1`:

int(False)

    
int(True)


The keyword `not` flips a Boolean value from `True` to `False` or vice versa:

a = True
    
b = False
    
not a

    
not b


#### Type casting[](#scalar_casting)

The `str`, `bool`, `int`, and `float` types are also functions that can be used to cast values to those types:

s = "3.14159"
    
fval = float(s)
    
type(fval)

    
int(fval)

    
bool(fval)

    
bool(0)


Note that most nonzero values when cast to `bool` become `True`.
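
A few edge cases are worth seeing directly. In this sketch, note that any non-empty string or container is truthy regardless of its contents, that `int` truncates a float rather than rounding it, and that `int` refuses a string containing a decimal point:

```python
truncated = int(3.99)                        # 3, not 4
empty_vs_nonempty = (bool(""), bool("0"))    # (False, True)
empty_vs_filled = (bool([]), bool([0]))      # (False, True)

# int() will not parse a decimal string directly...
try:
    int("3.14159")
except ValueError as exc:
    error_message = str(exc)

# ...so convert through float first
number = int(float("3.14159"))

print(truncated, empty_vs_nonempty, empty_vs_filled, number)
```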

#### None[](#scalar_none)

`None` is the Python null value type:

a = None
    
a is None

    
b = 5
    
b is not None


`None` is also a common default value for function arguments:

    def add_and_maybe_multiply(a, b, c=None):
        result = a + b
    
        if c is not None:
            result = result * c
    
        return result
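
Calling the function with and without the optional argument shows why the check is written as `c is not None` rather than `if c`. The function is repeated here so the sketch runs on its own; with a truthiness test, the `c=0` call would skip the multiplication and return 5 instead of 0:

```python
def add_and_maybe_multiply(a, b, c=None):
    result = a + b

    if c is not None:
        result = result * c

    return result


print(add_and_maybe_multiply(2, 3))       # 5
print(add_and_maybe_multiply(2, 3, 4))    # 20
print(add_and_maybe_multiply(2, 3, c=0))  # 0
```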

#### Dates and times[](#scalar_dates)

The built-in Python `datetime` module provides `datetime`, `date`, and `time` types. The `datetime` type combines the information stored in `date` and `time` and is the most commonly used:

from datetime import datetime, date, time
    
dt = datetime(2011, 10, 29, 20, 30, 21)
    
dt.day

    
dt.minute


Given a `datetime` instance, you can extract the equivalent `date` and `time` objects by calling methods on the `datetime` of the same name:

dt.date()

    
dt.time()


The `strftime` method formats a `datetime` as a string:

dt.strftime("%Y-%m-%d %H:%M")


Strings can be converted (parsed) into `datetime` objects with the `strptime` function:

datetime.strptime("20091031", "%Y%m%d")


See the [`datetime` formatting table](https://wesmckinney.com/book/python-basics.html#table_datetime_formatting) for a full list of format specifications.

When you are aggregating or otherwise grouping time series data, it will occasionally be useful to replace time fields of a series of `datetime`s—for example, replacing the `minute` and `second` fields with zero:

dt_hour = dt.replace(minute=0, second=0)
    
dt_hour


Since `datetime.datetime` is an immutable type, methods like these always produce new objects. So in the previous example, `dt` is not modified by `replace`:

dt


The difference of two `datetime` objects produces a `datetime.timedelta` type:

dt2 = datetime(2011, 11, 15, 22, 30)
    
delta = dt2 - dt
    
delta

    
type(delta)


The output `datetime.timedelta(days=17, seconds=7179)` indicates that the `timedelta` encodes an offset of 17 days and 7,179 seconds.

Adding a `timedelta` to a `datetime` produces a new shifted `datetime`:

dt

    
dt + delta
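
The arithmetic above can be checked with a short self-contained sketch. It reuses the same two timestamps and additionally constructs a `timedelta` directly, which is a convenient way to shift a `datetime` by a fixed amount:

```python
from datetime import datetime, timedelta

dt = datetime(2011, 10, 29, 20, 30, 21)
dt2 = datetime(2011, 11, 15, 22, 30)

delta = dt2 - dt
print(delta.days, delta.seconds)  # 17 7179
print(delta.total_seconds())      # 1475979.0

# Adding the difference back recovers the later timestamp
round_trip = dt + delta
print(round_trip == dt2)  # True

# A timedelta can also be built from explicit components
shifted = dt + timedelta(hours=12)
print(shifted)  # 2011-10-30 08:30:21
```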


### Control Flow[](#control_flow)

Python has several built-in keywords for conditional logic, loops, and other standard _control flow_ concepts found in other programming languages.

#### if, elif, and else[](#control_if_else)

The `if` statement is one of the most well-known control flow statement types. It checks a condition that, if `True`, evaluates the code in the block that follows:

    x = -5
    if x < 0:
        print("It's negative")

An `if` statement can be optionally followed by one or more `elif` blocks and a catchall `else` block if all of the conditions are `False`:

    if x < 0:
        print("It's negative")
    elif x == 0:
        print("Equal to zero")
    elif 0 < x < 5:
        print("Positive but smaller than 5")
    else:
        print("Positive and larger than or equal to 5")

If any of the conditions are `True`, no further `elif` or `else` blocks will be reached. With a compound condition using `and` or `or`, conditions are evaluated left to right and will short-circuit:

    a = 5; b = 7
    c = 8; d = 4
    if a < b or c > d:
        print("Made it")

    Made it

In this example, the comparison `c > d` never gets evaluated because the first comparison was `True`.

It is also possible to chain comparisons:

4 > 3 > 2 > 1
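
Short-circuiting can be observed by recording which operands actually get evaluated. In this sketch, `check` is a hypothetical helper that logs its label before returning a value. The last lines show that a chained comparison behaves like the corresponding `and` expression:

```python
calls = []


def check(label, value):
    # Record that this operand was evaluated, then return its value
    calls.append(label)
    return value


# "or" stops at the first true operand
check("first", True) or check("second", True)
calls_after_or = list(calls)
print(calls_after_or)  # ['first']

calls.clear()

# "and" stops at the first false operand
check("first", False) and check("second", True)
calls_after_and = list(calls)
print(calls_after_and)  # ['first']

x = 3
chained = 0 < x < 5
spelled_out = (0 < x) and (x < 5)
print(chained == spelled_out)  # True
```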


#### for loops[](#control_for)

`for` loops are for iterating over a collection (like a list or tuple) or an iterator. The standard syntax for a `for` loop is:

    for value in collection:
        # do something with value

You can advance a `for` loop to the next iteration, skipping the remainder of the block, using the `continue` keyword. Consider this code, which sums up integers in a list and skips `None` values:

    sequence = [1, 2, None, 4, None, 5]
    total = 0
    for value in sequence:
        if value is None:
            continue
        total += value

A `for` loop can be exited altogether with the `break` keyword. This code sums elements of the list until a 5 is reached:

    sequence = [1, 2, 0, 4, 6, 5, 2, 1]
    total_until_5 = 0
    for value in sequence:
        if value == 5:
            break
        total_until_5 += value

The `break` keyword only terminates the innermost `for` loop; any outer `for` loops will continue to run:

    for i in range(4):
        for j in range(4):
            if j > i:
                break
            print((i, j))

    (0, 0)
    (1, 0)
    (1, 1)
    (2, 0)
    (2, 1)
    (2, 2)
    (3, 0)
    (3, 1)
    (3, 2)
    (3, 3)

As we will see in more detail, if the elements in the collection or iterator are sequences (tuples or lists, say), they can be conveniently _unpacked_ into variables in the `for` loop statement:

    for a, b, c in iterator:
        # do something
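
A concrete version of the unpacking pattern, using a made-up list of two-element tuples:

```python
pairs = [("a", 1), ("b", 2), ("c", 3)]

lookup = {}
for letter, number in pairs:
    # Each tuple is unpacked into two loop variables
    lookup[letter] = number

print(lookup)  # {'a': 1, 'b': 2, 'c': 3}
```

If an element does not have exactly as many items as there are loop variables, Python raises a `ValueError`.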

#### while loops[](#control_while)

A `while` loop specifies a condition and a block of code that is to be executed until the condition evaluates to `False` or the loop is explicitly ended with `break`:

    x = 256
    total = 0
    while x > 0:
        if total > 500:
            break
        total += x
        x = x // 2

#### pass[](#control_pass)

`pass` is the “no-op” (or "do nothing") statement in Python. It can be used in blocks where no action is to be taken (or as a placeholder for code not yet implemented); it is required only because Python uses whitespace to delimit blocks:

    if x < 0:
        print("negative!")
    elif x == 0:
        # TODO: put something smart here
        pass
    else:
        print("positive!")
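The same structure can be placed in a function to make the effect of the placeholder visible. The function name `classify` is hypothetical and used only for this sketch.

```python
def classify(x):
    """Return a label for x; the zero case is deliberately left unhandled."""
    if x < 0:
        return "negative"
    elif x == 0:
        # Placeholder: a branch containing only a comment would be a syntax error
        pass
    else:
        return "positive"


print(classify(-3))  # negative
print(classify(0))   # None
print(classify(7))   # positive
```

Because the `elif` branch does nothing, a call with zero falls off the end of the function and returns `None`.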

#### range[](#control_range)

The `range` function generates a sequence of evenly spaced integers:

    range(10)        # range(0, 10)
    list(range(10))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

A start, end, and step (which may be negative) can be given:

    list(range(0, 20, 2))  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
    list(range(5, 0, -1))  # [5, 4, 3, 2, 1]

As you can see, `range` produces integers up to but not including the endpoint. A common use of `range` is for iterating through sequences by index:

    seq = [1, 2, 3, 4]
    for i in range(len(seq)):
        print(f"element {i}: {seq[i]}")

    element 0: 1
    element 1: 2
    element 2: 3
    element 3: 4

While you can use functions like `list` to store all the integers generated by `range` in some other data structure, often the default iterator form will be what you want. This snippet sums all numbers from 0 to 99,999 that are multiples of 3 or 5:

    total = 0
    for i in range(100_000):
        # % is the modulo operator
        if i % 3 == 0 or i % 5 == 0:
            total += i
    print(total)

    2333316668
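The same total can be cross-checked without the modulo test by using the step argument of `range`. This is an alternative sketch, not the approach used in the snippet above: it adds the multiples of 3 and of 5, then subtracts the multiples of 15, which would otherwise be counted twice.

```python
limit = 100_000

by_3 = sum(range(0, limit, 3))    # 0, 3, 6, ..., 99999
by_5 = sum(range(0, limit, 5))    # 0, 5, 10, ..., 99995
by_15 = sum(range(0, limit, 15))  # multiples of both 3 and 5

total = by_3 + by_5 - by_15
print(total)  # 2333316668
```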

While the range generated can be arbitrarily large, memory use at any given time stays small, because `range` produces its integers one at a time on demand rather than storing them all.
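A rough way to observe this is to compare a very large `range` with a much shorter list. The sketch below uses `sys.getsizeof`, whose exact numbers are an implementation detail of CPython, so only the comparison is shown rather than specific byte counts.

```python
import sys

big = range(1_000_000_000)       # one billion integers, none stored up front
small_list = list(range(1_000))  # one thousand integers, all stored

print(len(big))            # 1000000000
print(big[-1])             # 999999999
print(999_999_999 in big)  # True

# On CPython the range object is smaller than even this short list
print(sys.getsizeof(big) < sys.getsizeof(small_list))
```

Length, indexing, and membership tests on a `range` are computed from its start, stop, and step, so none of them require building the full sequence.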

## Conclusion[](#python_tutorial_summary)

This chapter provided a brief introduction to some basic Python language concepts and the IPython and Jupyter programming environments. In the next chapter, I will discuss many built-in data types, functions, and input-output utilities that will be used continuously throughout the rest of the book.
