# An Introduction to Python for Scientific Computing  

----

By Adam A Miller (Northwestern/CIERA/SkAI)  
09 Sept 2025

### Why Python? 

(because if this was anything else I wouldn't be able to teach it...)

  *  It is free (+history of being free)

  *  Flexible with a large number of (scientifically relevant) packages

  *  The world has decided that everything will be python

<div style="text-align: center;">
<img src='./images/github-octoverse-2024-top-languages.webp' width='300' />
</div>

The flexibility and breadth of python provide a lot of strengths (and sometimes the expectation that everyone just "knows" python).

Since we are not software engineers (and most of us are likely self-taught for programming) it is often the case that scientists are not aware of all the building blocks for python. 

$~~~~~~~~\longrightarrow$ Today we will attempt to fill in the basics

### 1) Logic and booleans

An old trope is that computers are just a bunch of ones and zeros. 

(forgiving the fact that I am not a computer scientist) Computers are nothing more than logic machines. 

(I will do my best to avoid tropes, but, as you will quickly learn, I'm a real sucker for bad jokes)

$~~~~~~~~$ Software is a series of commands designed by the programmer to deliver a prescriptive response. 

Within python, booleans (i.e., `True` or `False`) are central for assessing logic and determining how computational commands should proceed. 

Normally a boolean will be coupled with conditional statements (e.g., `if`) to control what the program does based on certain conditions.

Boolean logic uses logical operators to combine or modify Boolean values:

`and`: True if both conditions are `True`  
`or`: True if at least one condition is `True`  
`not`: Inverts the Boolean value
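
As a minimal sketch of these operators (the variable names and values here are purely illustrative), note that a comparison such as `temperature < 0` itself evaluates to a boolean, which can then be combined with others:

```python
# Illustrative values, not taken from any real data set
temperature = -3.0   # degrees Celsius
is_snowing = True

is_freezing = temperature < 0            # a comparison evaluates to a boolean
print(type(is_freezing), is_freezing)

needs_parka = is_freezing and is_snowing     # True only if both are True
needs_jacket = is_freezing or is_snowing     # True if at least one is True
can_wear_shorts = not is_freezing            # inverts the boolean value

print(needs_parka, needs_jacket, can_wear_shorts)
```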


Every day you use similar logic regarding the weather and the clothing that you wear: 

```
is_snowing = True
is_summer = False

if is_snowing:
    put_on_parka = True
elif not is_summer:
    put_on_windbreaker = True
```

### 2) Control flow

In practice, logic is rarely limited to a single `if` statement. Control flow provides a pathway for the software to execute the logic chain that delivers the desired results.  

The building blocks for control flow are: 

$~~~~~~~~$ `if`, `elif`, and `else` statements (basic logic)  
$~~~~~~~~$ `for` loops (allow iterative operations)  
$~~~~~~~~$ `while` loops (continuously run an operation until some condition is met)  

Here is a (boring!) example as related to the weather: 

```
temperature = 10  # degrees Celsius
is_raining = True

# Obvious instructions for a human based on the weather
if temperature < 0:
    print("It's freezing! Watch out for ice.")
elif temperature < 15:
    if is_raining:
        print("It's cold and rainy. Wear a waterproof jacket.")
    else:
        print("It's chilly but dry. A sweater should be enough.")
elif temperature < 25:
    if is_raining:
        print("Warm but wet. Maybe carry an umbrella.")
    else:
        print("Nice weather! Perfect for outdoor experiments.")
elif not is_raining:
    print("It's hot! Stay hydrated and avoid direct sunlight.")
```
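
The example above only exercises `if`, `elif`, and `else`. A short sketch of the other two building blocks, using a made-up hourly forecast (the numbers are hypothetical), might look like:

```python
# Hypothetical hourly temperature forecast in degrees Celsius
forecast = [12, 9, 4, -1, -3]

# for loop: visit every element in turn
n_freezing = 0
for temp in forecast:
    if temp < 0:
        n_freezing += 1
print(f"{n_freezing} hours are below freezing")

# while loop: keep going until a condition is met
# (here: stop at the first sub-zero hour, or at the end of the list)
hour = 0
while hour < len(forecast) and forecast[hour] >= 0:
    hour += 1
print(f"First sub-zero hour: {hour}")
```

If the forecast never drops below zero, `hour` simply ends up equal to `len(forecast)`. The length check comes first in the `while` condition so that the loop never indexes past the end of the list.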

(For a less boring example imagine instrument control. You do not want to turn on your laser or open your telescope dome unless the conditions are absolutely correct; otherwise the results could be quite catastrophic! ... ask me later why one of the CCDs on the Dark Energy Camera does not work)
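
A toy version of such a safety check could be sketched as follows. Every name and threshold here is hypothetical and chosen only for illustration; real observatory interlocks are far more involved.

```python
# All names and limits below are hypothetical, for illustration only
humidity = 62.0        # percent
wind_speed = 8.5       # m/s
sun_is_down = True

MAX_HUMIDITY = 85.0    # assumed humidity limit (percent)
MAX_WIND = 15.0        # assumed wind limit (m/s)

# Every condition must hold before the dome may open
safe_to_open = sun_is_down and humidity < MAX_HUMIDITY and wind_speed < MAX_WIND

if safe_to_open:
    print("Conditions OK: opening the dome.")
else:
    print("Unsafe conditions: the dome stays closed.")
```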

### 3) What's your type?

Everything in Python is an object, and objects have types.

(we will come back to what it means to be an object, and object oriented programming, later this week)

  *  The data type determines what operations you can perform.
  *  The type affects performance and memory usage.
  *  The type helps organize data logically for scientific problems.

The data types in python are familiar

(even if you have never thought about any of them in this way!)

There are **numbers**, which can be  

$~~~~~~~~$ `int`   
$~~~~~~~~$ `float`    
$~~~~~~~~$ `complex`  

If you are storing data or making calculations you'll want numbers.
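
A quick sketch of how the three numeric types behave (the variable names are illustrative):

```python
n_photons = 42            # int
flux = 3.2e-15            # float, written in scientific notation
amplitude = 1.5 + 2.0j    # complex

print(type(n_photons), type(flux), type(amplitude))

# Mixing an int and a float produces a float
print(type(n_photons * flux))

# True division returns a float; floor division of two ints returns an int
print(7 / 2, 7 // 2)

# Floats have finite precision, so exact equality tests can surprise you
print(0.1 + 0.2 == 0.3)

# abs() of a complex number gives its modulus
print(abs(amplitude))
```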

There are also **strings** (`str`), which are typically used to label data or files, and which are also used to store metadata. 

**Lists** are flexible ordered collections typically used for small amounts of data or *mixed* data types. 

*Example*: `['today', 'I', 'ate', 11, 'bananas']`

(as your data sets become large lists become less effective and powerful)

**Tuples** are immutable sequences. They are particularly good for fixed data. 

For example, the coordinates of a star are represented by its right ascension and declination, which can be stored in a tuple: `coords = (175.2543, 23.6193)`.
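
The practical difference between lists (mutable) and tuples (immutable) can be sketched as follows, reusing the two examples above:

```python
# Lists are mutable: elements can be changed or appended in place
breakfast = ['today', 'I', 'ate', 11, 'bananas']
breakfast[3] = 12
breakfast.append('again')
print(breakfast)

# Tuples are immutable: item assignment raises a TypeError
coords = (175.2543, 23.6193)
try:
    coords[0] = 180.0
except TypeError as err:
    print(f"TypeError: {err}")

# Tuples can be unpacked into separate variables
ra, dec = coords
print(f"RA = {ra} deg, Dec = {dec} deg")
```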

**Dictionaries** are key-value pairs. They are accessed by key rather than by position (since Python 3.7 they do preserve insertion order). They are especially useful for mixed data types and effectively map values to their associated name. 

*Example* –  
`chicago_diet = {"favorite food": "pizza", "favorite vegetable": float('nan')}`

In [3]:
chicago_diet = {"favorite food": "pizza", "favorite vegetable": float('nan')}

print(chicago_diet['favorite food'])

pizza


**Sets** contain unique elements which can be useful for identifying/removing duplicates within a data set. 

*Example* – {1, 2, 3}
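
A slightly longer sketch, using a hypothetical list of filter names, shows how a set removes duplicates:

```python
# Hypothetical list of filters used during a night of observing
filters = ['g', 'r', 'i', 'g', 'r', 'g']

unique_filters = set(filters)     # duplicates are dropped
print(sorted(unique_filters))     # sets are unordered, so sort for display
print('z' in unique_filters)      # membership test
```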

(now, let us never speak of `python` sets again $\longrightarrow$ there are tools in numpy that can handle this a lot more efficiently)

I think it is straightforward to imagine how you will use numbers and strings within python. We will now work through a couple examples to better understand lists, tuples, and dictionaries. 

### 4) String formatting

Manipulating strings is extremely important when analyzing text data, and also when producing output meant to be processed by humans. 

Consider the following (fake) observing log and the following operations that are useful for extracting information from it. 

*Note* – the triple quote `"""` allows one to create a multiline string in python

In [23]:

obs_log = """
# Instrument log — Night 2025-08-14
OBJECT = 'M31'           / Andromeda Galaxy
RA     = '00:42:44.3'    / hours:minutes:seconds
DEC    = '+41:16:09'     / degrees:arcminutes:arcseconds
EXPTIME= 1200            / seconds
OBSERVER= 'A. Miller'
NOTES  = 'Thin cirrus; seeing ~1.2"'
REDSHIFT  =  −0.001004
LINES  =
    Halpha 6562.80 Å  Int=120  SNR=58
    [OIII] 5006.84 Å  Int=89   SNR=42
    Na D   5891.58 Å  Int=34   SNR=12
"""
print(obs_log)



# Instrument log — Night 2025-08-14
OBJECT = 'M31'           / Andromeda Galaxy
RA     = '00:42:44.3'    / hours:minutes:seconds
DEC    = '+41:16:09'     / degrees:arcminutes:arcseconds
EXPTIME= 1200            / seconds
OBSERVER= 'A. Miller'
NOTES  = 'Thin cirrus; seeing ~1.2"'
REDSHIFT  =  −0.001004
LINES  =
    Halpha 6562.80 Å  Int=120  SNR=58
    [OIII] 5006.84 Å  Int=89   SNR=42
    Na D   5891.58 Å  Int=34   SNR=12



strings are indexed, and as a result they can be sliced:

In [24]:
s = "Halpha 6562.80 Å  Int=120  SNR=58"
print(s[0])        # first character
print(s[-1])       # last character
print(s[:7])       # up to (not including) index 7
print(s[8:16])     # a middle slice
print(s[::-1])     # reversed


H
8
Halpha 
562.80 Å
85=RNS  021=tnI  Å 08.2656 ahplaH


whitespace can easily be removed from strings:

In [25]:

line = "   \tM31, Andromeda Galaxy   \n"
print(repr(line.strip()))       # remove leading & trailing whitespace
print(repr(line.lstrip()))      # remove only leading whitespace
print(repr(line.rstrip()))      # remove only trailing whitespace

# Normalize multiple internal spaces to a single space:
messy = "Halpha    6562.80      Å"
clean = " ".join(messy.split())
print(clean)


'M31, Andromeda Galaxy'
'M31, Andromeda Galaxy   \n'
'   \tM31, Andromeda Galaxy'
Halpha 6562.80 Å


strings can be split and joined (e.g., to find a specific keyword and the entry that always comes after it):

In [26]:
sodium = "Na D,5891.58,Int=34,SNR=12"
parts = sodium.split(",")
print(parts)

# Join tokens back:
rejoined = " | ".join(parts)
print(rejoined)


['Na D', '5891.58', 'Int=34', 'SNR=12']
Na D | 5891.58 | Int=34 | SNR=12


if no arguments are given to split, then the string is split on whitespace (which is super useful for "real" input text): 

In [27]:
print("Halpha    6562.80    Å".split()) 

['Halpha', '6562.80', 'Å']
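
Putting splitting to work, here is one possible sketch for turning a single emission-line entry from the log into a dictionary. The layout of the `entry` dictionary is an assumption made for illustration, and the approach assumes the line name is a single word.

```python
line = "Halpha 6562.80 Å  Int=120  SNR=58"

# Split on any run of whitespace:
# ['Halpha', '6562.80', 'Å', 'Int=120', 'SNR=58']
tokens = line.split()

# The first token is the name, the second is the wavelength
entry = {"name": tokens[0], "wavelength": float(tokens[1])}

# The remaining tokens after the unit look like key=value
for token in tokens[3:]:
    key, value = token.split("=")
    entry[key] = int(value)

print(entry)
```

An entry such as `Na D`, where the name itself contains a space, would need different handling (e.g., splitting from the right or using fixed-width columns).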


membership within a string can be tested, with find/replace and counting also possible

In [28]:
s = "Int=120  SNR=58"
print("SNR" in s)           # membership
print(s.find("SNR"))        # index (or -1 if not found)
print(s.replace("Int", "Intensity"))
print(s.count("0"))         # count occurrences
print(s.startswith("Int"))
print(s.endswith("58"))


True
9
Intensity=120  SNR=58
1
True
True


case handling allows inputs/outputs to be normalized (e.g., when logging is not done consistently):

In [29]:
label = "hALpHa"
print(label.lower())
print(label.upper())
print(label.casefold())  # best for case-insensitive comparisons
print(label.casefold() == "halpha")  # case-insensitive comparison


halpha
HALPHA
halpha
True


by default computers do not format numbers cleanly $\longrightarrow$ string formatting provides far more readable text:

In [32]:
signal = 467
noise = 12.4
print(signal/noise)

37.66129032258064


(that's a lot of decimals) format specs improve this

In [37]:
print(f"{signal/noise:.3f}")

37.661


The previous example also uses an `f`-string, which allows python variables and expressions to be placed directly within a string. (This dramatically improves readability over older styles of string formatting)

In [40]:
wavelength = 6562.8
print(f"Hα at λ = {wavelength:.2f} Å (SNR={int(signal/noise):d})") # SNR as an integer
print(f"Scientific notation: {wavelength:.3e}")
print(f"Right-aligned width 12: |{wavelength:12.2f}|") # useful for fixed-width tables
print(f"Centered label: |{'Halpha':^10}|") # useful for tables


Hα at λ = 6562.80 Å (SNR=37)
Scientific notation: 6.563e+03
Right-aligned width 12: |     6562.80|
Centered label: |  Halpha  |
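
For comparison, here is a sketch of the same formatted number written in the two older styles and as an f-string. All three produce identical text; the f-string keeps the value next to its format spec.

```python
signal = 467
noise = 12.4
snr = signal / noise

old_percent = "SNR = %.3f" % snr            # printf-style formatting
old_format = "SNR = {:.3f}".format(snr)     # str.format()
f_string = f"SNR = {snr:.3f}"               # f-string (Python 3.6+)

print(old_percent)
print(old_format)
print(f_string)
```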


### 5) The utility of dictionaries

Each key-value pair within a dictionary is called an item. Every dictionary has methods to access the `.keys()`, `.values()`, and `.items()`.

In [17]:
constants = {'c':2.9979e10, 'h':6.626e-27, 'G':6.673e-8}
for key, value in constants.items(): 
    print(f'{key} has a value of {value}')

c has a value of 29979000000.0
h has a value of 6.626e-27
G has a value of 6.673e-08


The `.get()` method can be used to retrieve values from a dictionary, similar to `dictionary[key]`, but `.get()` will return a default if the key is not present. 

In [20]:
print(constants.get('me', float('nan')))
print(constants['me'])

nan


KeyError: 'me'
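
Besides `.get()`, there are a couple of other common ways to guard against a missing key. This sketch also adds the missing entry afterwards. The electron mass is given in grams on the assumption that the dictionary holds cgs values, as the other constants suggest.

```python
constants = {'c': 2.9979e10, 'h': 6.626e-27, 'G': 6.673e-8}

# Option 1: test membership before indexing
if 'me' in constants:
    print(constants['me'])
else:
    print("'me' is not defined")

# Option 2: catch the KeyError
try:
    m_e = constants['me']
except KeyError:
    m_e = None
print(m_e)

# New items are added by assigning to a new key
constants['me'] = 9.109e-28   # electron mass in g (assuming cgs units)
print(list(constants.keys()))
```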

Dictionaries can be particularly useful for storing metadata, especially if it is not structured and may be different for every entry. Dictionaries can also be nested. 

In [22]:
faculty = { 'andre':{'level': 'full', 'office': 'tech'}, 
            'adam':{'level': 'assistant', 'office': '1800 sherman', 
                    'favorite_animal': 'red panda'}
          }

print(f"Adam's favorite animal is {faculty['adam']['favorite_animal']}")

Adam's favorite animal is red panda


As a highly practical example, dictionaries are extremely useful in plotting, especially when dealing with comparisons of categorical data.

```
plot_params = {'positive': {'color': 'red', 'linestyle': '-', 'markersize': 12},
               'negative': {'color': 'green', 'linestyle': '-.', 'markersize': 7}
                }
```
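
One way such a dictionary gets used is by unpacking the inner dictionary into keyword arguments with `**`. In the sketch below `describe_style` is a hypothetical stand-in for a plotting call, so that the example runs without any plotting library installed.

```python
plot_params = {'positive': {'color': 'red', 'linestyle': '-', 'markersize': 12},
               'negative': {'color': 'green', 'linestyle': '-.', 'markersize': 7}}


def describe_style(label, color='black', linestyle='-', markersize=6):
    """Hypothetical stand-in for a plotting call; returns a summary string."""
    return f"{label}: color={color}, linestyle={linestyle!r}, markersize={markersize}"


for category, style in plot_params.items():
    # **style unpacks the inner dictionary into keyword arguments;
    # with matplotlib this would look like ax.plot(x, y, label=category, **style)
    print(describe_style(category, **style))
```

Keeping the styles in one dictionary means every figure in a project can draw each category consistently, and a style change only has to be made in one place.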

### 6) List comprehension

List comprehension allows lists to be quickly created or manipulated within python. 

It is useful because explicit `for` loops are typically slow in python. A list comprehension is usually faster than the equivalent loop that appends to a list, and it typically fits in a single line of code. 


As an overly simple example, suppose you wanted to cube every value within a list. List comprehension can do this quickly:  

`nums = [2, 3, 5, 7, 11]`  
`cube = [i**3 for i in nums]`

(if you ever really needed to do this, please just use `numpy`...)

In [8]:
nums = [2, 3, 5, 7, 11]  
cube = [i**3 for i in nums]
print(cube)

[8, 27, 125, 343, 1331]
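
To see what the comprehension replaces, here is a sketch comparing it with the equivalent explicit loop, timed with the standard-library `timeit` module. The function names are illustrative, and the timings will differ from machine to machine, so no particular speed-up is implied.

```python
import timeit

nums = list(range(1000))


def cube_loop(values):
    """Explicit for loop that appends to a list."""
    result = []
    for i in values:
        result.append(i**3)
    return result


def cube_comprehension(values):
    """The same operation as a list comprehension."""
    return [i**3 for i in values]


# Both approaches give identical results
assert cube_loop(nums) == cube_comprehension(nums)

# Total time for 200 calls of each function
t_loop = timeit.timeit(lambda: cube_loop(nums), number=200)
t_comp = timeit.timeit(lambda: cube_comprehension(nums), number=200)
print(f"for loop: {t_loop:.4f} s, comprehension: {t_comp:.4f} s")
```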


List comprehension can handle more complex logic.  

Suppose you wanted to store the square of each number in a list, but only if its value is greater than 1 (and store 1 otherwise). 

`nums = [0.5, 1.2, 0.998, 0.12, 6.4, 8]`  
`min_square = [i**2 if i > 1 else 1 for i in nums]`

(yes this is an arbitrary example that is not immediately connected to reality)

In [9]:
nums = [0.5, 1.2, 0.998, 0.12, 6.4, 8]
min_square = [i**2 if i > 1 else 1 for i in nums]
print(min_square)

[1, 1.44, 1, 1, 40.96000000000001, 64]


As a more practical/useful example, list comprehension can be used to quickly calculate Taylor expansions with help from the built-in [`range()`](https://docs.python.org/3/library/functions.html#func-range) function.
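
As a sketch of the pattern, using a different series so as not to spoil the challenge below, here is the Taylor expansion $e^x = \sum_{n=0}^{\infty} x^n/n!$ truncated to a finite number of terms. The choices of `x` and `n_terms` are arbitrary.

```python
import math

x = 0.5
n_terms = 10

# x**n / n! for n = 0, 1, ..., n_terms - 1
terms = [x**n / math.factorial(n) for n in range(n_terms)]
approx = sum(terms)

print(f"Series estimate: {approx:.8f}")
print(f"math.exp(x):     {math.exp(x):.8f}")
```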

**Challenge problem** Estimate the value of 1/0.627 using a Taylor expansion with 5 terms. How does this compare to 11 terms, and 101 terms? 

*Hint* – the Taylor expansion for $\frac{1}{1-x}$ is $\sum_{n=0}^{\infty} x^n$ (valid for $|x| < 1$).

In [10]:
# complete

### 7) Indexing, slicing, and iterating over lists

Data are often organized in regular structures (e.g., lists; later we will also discuss arrays and tables) and indexing, slicing, and iterating over these structures provide fast and efficient methods for data handling.

It is possible to access specific elements:

In [41]:
times = [0.0, 1.2, 2.4, 3.6, 4.8]  # times in seconds
print("Start time:", times[0])
print("End time:", times[-1])


Start time: 0.0
End time: 4.8


Slicing allows multiple elements to be selected at once (this need not be limited to consecutive entries):

In [43]:
time = [0, 1, 2, 3, 4, 5]  # seconds
velocity = [0, 2, 4, 6, 8, 10]  # m/s

# Get velocities from t=2 to t=4
print("Selected velocities:", velocity[2:5])
print("Selected velocities:", velocity[2:5:2])

Selected velocities: [4, 6, 8]
Selected velocities: [4, 8]


Data structures can also be reversed: 

In [44]:
print("Reversed:", times[::-1])

Reversed: [4.8, 3.6, 2.4, 1.2, 0.0]


Iteration (typically done with a `for` loop) is an especially powerful tool that allows manipulation of every element within the data structure: 

In [45]:
energies = [5.0, 10.0, 15.0, 20.0]  # in MeV
for energy in energies:
    print(f"Energy: {energy} MeV")

Energy: 5.0 MeV
Energy: 10.0 MeV
Energy: 15.0 MeV
Energy: 20.0 MeV


Python has built-ins for use within `for` loops: `enumerate()` (useful for indexing) and `zip()` (useful for combining multiple sequences): 

In [46]:
for i, en in enumerate(energies):
    print(f"Time {i}s: Energy = {en} MeV")


Time 0s: Energy = 5.0 MeV
Time 1s: Energy = 10.0 MeV
Time 2s: Energy = 15.0 MeV
Time 3s: Energy = 20.0 MeV


In [49]:
mass = [1.0, 2.0, 3.0]  # kg
acceleration = [9.8, 4.9, 3.266]  # m/s²

for m, a in zip(mass, acceleration):
    force = m * a
    print(f"Mass: {m} kg, Acceleration: {a} m/s², Force: {force:.2f} N")


Mass: 1.0 kg, Acceleration: 9.8 m/s², Force: 9.80 N
Mass: 2.0 kg, Acceleration: 4.9 m/s², Force: 9.80 N
Mass: 3.0 kg, Acceleration: 3.266 m/s², Force: 9.80 N


### Conclusions

This introduction to many of the basic building blocks for python has focused on methods and tricks that are especially important for scientific computing. 

We have discussed many "basics" that are not always explained (especially for those who are self-taught!)

Using these basic tools it is possible to get incredibly far (though we will demonstrate throughout this week that there are many many tools that provide significant improvements/speed ups relative to the "basics").