# Python Scripts & Program Execution

When should we use scripts? Jupyter notebooks encourage tweaking and exploration. While this is an important part of the development process, we often reach a point where we just want to run the notebook from start to finish, e.g. to perform some data processing. When the notebook is operating more as code that we want to execute sequentially and repeatedly, it's time to switch to scripts.

What is a Python script? A Python script is a standalone file (`.py`) designed for sequential execution. Unlike modules that are written to be imported and reused, scripts are written to be executed directly. Compared with notebooks, scripts are easier to maintain, produce cleaner diffs under version control, and are simpler to automate, making them ideal for reproducible workflows and deployment.

What happens when you run a script? When a script runs, Python parses the code, compiles it to bytecode, and executes that bytecode in the Python Virtual Machine (PVM). Because every run starts from a fresh interpreter and proceeds from top to bottom, this avoids issues like hidden state in notebooks and makes execution predictable.
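The parse, compile, and execute steps can be sketched by hand with the built-ins `compile()` and `exec()` plus the standard `dis` module. This is only an illustration of the stages, not what we would normally write; the one-line `source` string and the `<example>` filename are placeholders made up for the demonstration.

```python
import dis

# A tiny piece of source code, standing in for the contents of a script.
source = "result = 2 + 3\n"

# Steps 1 and 2: parse the source text and compile it to a code object (bytecode).
code_object = compile(source, filename="<example>", mode="exec")

# Optional: print a human-readable listing of the bytecode instructions.
dis.dis(code_object)

# Step 3: the interpreter executes the bytecode inside a namespace.
namespace = {}
exec(code_object, namespace)
print(namespace["result"])
```

Running `python script.py` performs these same stages for us on the whole file.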

## 00. Getting Set Up

In [1]:
!pip install --upgrade pip
!pip install -r ../requirements.txt



## 01. Python Execution

In the previous notebook we covered what Python can "see" when it's trying to import something. Another important aspect of executing Python code is where the code is being run from; we call this the "working directory".

What is the working directory? The working directory is the filesystem location that the Python process is running from. It is where Python looks by default when it reads or writes files given as relative paths, and in a notebook or interactive session it is also searched when importing modules.

In [2]:
import os
print(os.getcwd())

c:\Users\can134\OneDrive - CSIRO\Documents\CSIRO\NextGen\2025\coding-bootcamp\nextgen2025-codingbootcamp-session06\notebooks


In [3]:
from pathlib import Path
# resolve() already returns an absolute path, so absolute() is not needed
print(Path(".").resolve())

C:\Users\can134\OneDrive - CSIRO\Documents\CSIRO\NextGen\2025\coding-bootcamp\nextgen2025-codingbootcamp-session06\notebooks


In [4]:
# find items in current directory with `pathlib`

In [5]:
# "." defaults to our current working directory, we can then make changes relative to this
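One way the two cells above might be filled in is sketched below. It uses only `pathlib` calls from the standard library; the `../scripts` path is the one used later in this notebook and is assumed to exist only in that project layout, so the sketch resolves it without checking that it is there.

```python
from pathlib import Path

# The current working directory as a Path object.
cwd = Path.cwd()

# List the names of the items directly inside the working directory.
item_names = sorted(item.name for item in cwd.iterdir())
print(item_names)

# "." is resolved against the working directory ...
print(Path(".").resolve())

# ... and so is any other relative path, such as the parent directory
# or a sibling "scripts" directory (not checked for existence here).
parent_dir = Path("..").resolve()
scripts_dir = (Path("..") / "scripts").resolve()
print(parent_dir)
print(scripts_dir)
```

Because relative paths depend on where the code is run from, the same script can find different files when launched from different directories.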

Whenever we run code, Python also attaches some extra variables to each module, called double underscore attributes, or dunders for short. These let both Python and us work out things like the name of a module, the file it was loaded from, and whether it is the main code being run, along with other module-level information.

In [None]:
# explore some __dunder__ attributes
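A minimal sketch of exploring a few of these attributes follows. The standard-library `json` module is used purely as a convenient example of an imported module; any other module would work the same way.

```python
import json

# Every imported module carries its own dunder attributes.
module_name = json.__name__   # the name the module was imported under
module_file = json.__file__   # the path of the file it was loaded from
print(module_name)
print(module_file)

# The code that is being run also has a __name__ of its own.
# When a file is executed directly, this is the string "__main__".
current_name = __name__
print(current_name)
```

Comparing `__name__` for an imported module with `__name__` for the code being run is what lets a file tell whether it was imported or executed directly.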

## 02. Python Scripts

A Python script is just a Python module, i.e. a file containing some Python code, typically with a `.py` extension, that is designed to be executed as a standalone piece of code rather than imported as a way of organizing code. Let's take a look at running some scripts.

Typically we would run scripts from the command line by typing `python <path>/<to>/<script>.py`. Using `!`, we can also do this from inside a notebook, with some caveats.

In [9]:
!python ../scripts/script.py

hello world


In [10]:
# explore some different aspects of scripts e.g. __dunder__ methods and their execution
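The sketch below shows one aspect worth exploring: how `__name__` differs when the same file is run as a script versus imported as a module. The file `demo_script.py` and its contents are hypothetical and are written to a temporary directory just for this demonstration; they are not the contents of `../scripts/script.py`.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical script contents, written to a temporary file for the demo.
SCRIPT_SOURCE = '''
def main():
    print("hello world")

print(f"__name__ is {__name__}")

if __name__ == "__main__":
    main()
'''

with tempfile.TemporaryDirectory() as tmp_dir:
    script_path = Path(tmp_dir) / "demo_script.py"
    script_path.write_text(SCRIPT_SOURCE)

    # Run as a script: __name__ is "__main__", so main() is called.
    run_result = subprocess.run(
        [sys.executable, str(script_path)],
        capture_output=True, text=True, check=True,
    )

    # Import as a module: __name__ is "demo_script", so main() is skipped.
    import_result = subprocess.run(
        [sys.executable, "-c", "import demo_script"],
        capture_output=True, text=True, check=True, cwd=tmp_dir,
    )

script_output = run_result.stdout.splitlines()
import_output = import_result.stdout.splitlines()
print(script_output)
print(import_output)
```

The `if __name__ == "__main__":` guard is the usual way to keep a file importable while still letting it do its work when executed directly.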

## 03. Scripts as Jobs

We essentially want to treat scripts as blocks of code that can be executed in a repeatable manner; this significantly improves the reproducibility of our work.

Let's create a script that downloads the dataset, creates an interface for it, and does some processing on it.

In [33]:
!python ../scripts/script_with_dataset.py

0 has label 9 with img of shape (28, 28, 3)
1 has label 2 with img of shape (28, 28, 3)
2 has label 1 with img of shape (28, 28, 3)
3 has label 1 with img of shape (28, 28, 3)
4 has label 6 with img of shape (28, 28, 3)
5 has label 1 with img of shape (28, 28, 3)
6 has label 4 with img of shape (28, 28, 3)
7 has label 6 with img of shape (28, 28, 3)
8 has label 5 with img of shape (28, 28, 3)
9 has label 7 with img of shape (28, 28, 3)
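The contents of `script_with_dataset.py` are not shown here, so the following is only a sketch of how a job-style script of this kind might be laid out. The `ToyImageDataset` class, its randomly generated labels, and the `main()` function are all hypothetical stand-ins; no real data is downloaded, and only the output format mirrors the lines above.

```python
import random


class ToyImageDataset:
    """Hypothetical stand-in for a real dataset interface."""

    def __init__(self, num_items, seed=0):
        # Placeholder labels; a real dataset would read these from disk.
        rng = random.Random(seed)
        self.labels = [rng.randrange(10) for _ in range(num_items)]
        self.image_shape = (28, 28, 3)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, index):
        # A real dataset would load pixel data here; we only return the shape.
        return self.image_shape, self.labels[index]


def main(num_items=3):
    """Build the dataset, process each item, and return one report line per item."""
    dataset = ToyImageDataset(num_items)
    lines = []
    for index in range(len(dataset)):
        shape, label = dataset[index]
        lines.append(f"{index} has label {label} with img of shape {shape}")
    return lines


if __name__ == "__main__":
    for line in main():
        print(line)
```

Keeping the work inside `main()` and fixing the random seed means every run of the script does the same thing in the same order, which is the property that makes scripts useful as repeatable jobs.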
