# Scientific Python: part 2

<br><br><br>

**The cost of convenience**

<br><br><br>

The biggest performance divide among programming languages is between compiled languages and dynamic ones, and Python is on the dynamic side.

<center>
<img src="img/benchmark-games-2023.svg" width="1000pt">
</center>

## Performance characteristics of Python

Reload the Higgs dataset and collect the electron and muon `pt` values into one flat list of numbers.

In [1]:
import json

In [3]:
# use a context manager so the file is closed after parsing
with open("data/SMHiggsToZZTo4L.json") as file:
    dataset_python = json.load(file)

In [4]:
pt_python = []
for event in dataset_python:
    for electron in event["electron"]:
        pt_python.append(electron["pt"])
    for muon in event["muon"]:
        pt_python.append(muon["pt"])

In [10]:
pt_python[:4]

[63.04386901855469, 38.12034606933594, 4.04868745803833, 21.902679443359375]

In [5]:
len(pt_python)

28809

How much memory does this list use?

In [6]:
import sys

In [18]:
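num_bytes = 0

# the list object itself: its header plus one pointer per element
num_bytes += sys.getsizeof(pt_python)

# each float is a separate Python object, so count them one by one
for x in pt_python:
    num_bytes += sys.getsizeof(x)

num_bytes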

937904

How many bytes per value is that? More than the 8 bytes a 64-bit float needs?

In [19]:
num_bytes / len(pt_python)

32.55593738067965
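
The 32.56 bytes per value can be split into its parts. On a typical 64-bit CPython build, each float is a full object of about 24 bytes, and the list stores a pointer to each one. The numbers above fit that picture: 28809 × 24 = 691416 bytes of float objects, leaving 246488 bytes (about 8.6 per element) for the list itself, which includes spare capacity left over from `append`. A minimal sketch that measures the pieces separately, using a synthetic `values` list as a stand-in for `pt_python` (the exact sizes depend on the Python build and are not guaranteed):

```python
import sys

# Hypothetical stand-in for pt_python: one distinct float object per element.
values = [float(i) + 0.5 for i in range(10_000)]

# The list object: a header plus one pointer per element (and any spare capacity).
list_bytes = sys.getsizeof(values)
pointer_bytes_per_value = list_bytes / len(values)

# One float object: bookkeeping fields plus the 8-byte number itself.
float_object_bytes = sys.getsizeof(values[0])

# Same accounting as the cell above: the list plus every float it points to.
total_bytes = list_bytes + sum(sys.getsizeof(x) for x in values)
bytes_per_value = total_bytes / len(values)

print(pointer_bytes_per_value, float_object_bytes, bytes_per_value)
```

Only 8 of the bytes in each float object are the number; the rest is per-object bookkeeping, and the pointer in the list comes on top of that.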

Get the same data as a NumPy array, read from an HDF5 file.

In [20]:
import h5py

In [21]:
dataset_hdf5 = h5py.File("data/SMHiggsToZZTo4L.h5", "r")

In [22]:
pt_numpy = dataset_hdf5["particles"]["pt"][:]

In [23]:
pt_numpy

array([63.04387  , 38.120346 ,  4.0486875, ..., 60.098644 ,  3.7663147,
       21.205685 ], dtype=float32)

In [27]:
sys.getsizeof(pt_numpy) / len(pt_numpy)

4.003887673990767
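
The 4.0039 bytes per value is 4 bytes for each `float32` plus a small fixed header: 28809 × 4 = 115236 bytes of data, and the remaining 112 bytes are the array object itself. For the NumPy array, `pt_numpy.itemsize` and `pt_numpy.nbytes` report the per-element and total data sizes directly. The same accounting can be sketched without any dependencies by swapping in the standard library's `array` module, which also stores raw numbers back to back. This is a stand-in for NumPy, not what the notebook uses, and `values` is synthetic data:

```python
import sys
from array import array

# Hypothetical stand-in values, not the Higgs data.
values = [float(i) + 0.5 for i in range(10_000)]

# array("f") packs the numbers as 32-bit floats in one contiguous buffer.
packed = array("f", values)

# Raw data: a fixed number of bytes per element, no per-element objects.
data_bytes = packed.itemsize * len(packed)

# Total size of the array object, including its buffer.
total_bytes = sys.getsizeof(packed)
header_bytes = total_bytes - data_bytes

print(packed.itemsize, total_bytes / len(packed), header_bytes)
```

The header is paid once per array rather than once per value, so its share shrinks as the array grows.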

<center>
<img src="img/python-list-layout.svg" width="800pt">
</center>

<center>
<img src="img/python-array-layout.svg" width="800pt">
</center>