## Chapter 5: Organizing code

Getting code to run is only the beginning. As your projects grow, keeping things organized becomes just as important as making them work. Without structure, your code can quickly turn into spaghetti code—a tangled mess that’s hard to understand, reuse, or build on.

Good organization is key to writing code that lasts. Whether you’re working solo or with others, how you structure your code affects how easily it can be read, maintained, and improved.

In this chapter, we’ll explore how functions, classes, and modular programming bring clarity and structure to your code, helping you build programs that are both robust and scalable.

## Functions

A function is a reusable block of code that performs a specific task. It allows you to group related instructions under a single name, so you can run them whenever you need—without repeating yourself. Functions help make your code more organized, readable, and easier to maintain.

The syntax of a function is as follows:

```python
def function_name(arg_1, arg_2, ...., arg_n):
    statements
    return val
```
The `def` keyword begins the function header; it creates the function object and assigns a name to it. The input parameters go inside the parentheses. When the function has no input parameters, the parentheses are left empty. The function statements follow the colon (`:`) in an indented block. The `return` statement returns a value. If `val` is not specified, or if there is no `return` statement at all, the function returns `None`.
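
As a small illustration of these rules, the sketch below defines three toy functions (`add`, `log_message`, and `answer` are hypothetical names, not part of the course code): one with a return value, one without a `return` statement, and one without input parameters.

```python
def add(a, b):
    # explicit return value
    return a + b

def log_message(message):
    # no return statement, so a call evaluates to None
    print(message)

def answer():
    # no input parameters: the parentheses stay empty
    return 42

total = add(2, 3)
nothing = log_message("hello")
value = answer()

print(total, nothing, value)
```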

Let’s turn the code that reads data from Factpages into a function:

In [None]:
# import libraries required for the notebook
import os # work with directories and paths
import requests # HTTP requests
from io import StringIO # treat text as a file-like object
import numpy as np # numerical arrays
import pandas as pd # DataFrames
import matplotlib.pyplot as plt # plotting
from scipy import integrate # numerical integration

In [None]:
def read_factpages(descriptor):
    """
    Read data from NOD factpages
    Input:
        descriptor: String with NOD descriptor, 
        e.g. "field_production_monthly"
    Output:
        df: DataFrame with the data or
        empty if data could not be read
    """
    # construct the URL
    u_1 = "https://factpages.sodir.no/public?/Factpages/external/tableview/"
    u_2 = "&rs:Command=Render&rc:Toolbar=false&rc:Parameters=f"
    u_3 = "&IpAddress=not_used&CultureCode=en&rs:Format=CSV&Top100=false"
    url = u_1 + descriptor + u_2 + u_3

    # create an empty DataFrame
    df = pd.DataFrame()

    # request the data
    response = requests.get(url)
    # if the request was successful
    if response.status_code == 200:
        # load csv data into a DataFrame
        df = pd.read_csv(StringIO(response.text))
    else:
        print(f"Error: {response.status_code}")
    
    return df

The text enclosed in triple quotes is called a *docstring*—it describes what the
function does. Including a clear and concise docstring is always a good habit,
as it helps others (and your future self) understand the purpose of the function.
In VS Code, this information can be accessed anytime using IntelliSense.
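
Outside the editor, the docstring is also stored on the function object itself. The sketch below uses a hypothetical `rectangle_area` function (not part of the course code) to show how the text can be read back with `inspect.getdoc()`; the built-in `help()` displays the same information.

```python
import inspect

def rectangle_area(width, height):
    """
    Return the area of a rectangle
    Input:
        width, height: side lengths
    """
    return width * height

# getdoc() returns the docstring with the indentation cleaned up
doc = inspect.getdoc(rectangle_area)
print(doc)
```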

Now we can use the function to read data:

In [None]:
# monthly field production
df = read_factpages("field_production_monthly")
df.info()

By accepting a `descriptor` as input, the function becomes flexible and reusable—it can read any dataset from Factpages, as long as we know the corresponding descriptor. For example:

In [None]:
# long list of all exploration wells
df = read_factpages("wellbore_exploration_all")
df.info()

Let’s try another example. The area of a polygon of any shape (except one that crosses itself) can be written as:

$$
A=\frac{1}{2} \sum_{i=1}^n\left(x_i y_{i+1}-x_{i+1} y_i\right), \quad \text{with } x_{n+1}=x_1 \text{ and } y_{n+1}=y_1
$$

where $n$ is the number of points in the polygon, and $(x_i, y_i)$ are the $x$ and
$y$ coordinates of point $i$. For example, for a four-point polygon the area is:

$$A=\frac{1}{2}\left(x_1 y_2-x_2 y_1+x_2 y_3-x_3 y_2+x_3 y_4-x_4 y_3+x_4 y_1-x_1 y_4\right)$$

This equation returns a positive area if the points are ordered counterclockwise, or a negative area if the points are ordered clockwise. Let’s implement a function that computes the area of a polygon using this formula, taking the absolute value so that the result is positive for either ordering:

In [None]:
def polyg_area(x,y):
    """
    calculate and return the area of a polygon
    from the x and y coordinates of its points
    Note: points must be in sequential order
    """
    npoints = x.size # number of points
    area = 0.0 # initialize area
    
    # calculate polygon's area
    for i in range(npoints): 
        # index of the next point
        next_i = i + 1 
        if i == npoints-1: # if last point
            next_i = 0
        # calculate area 
        area += (x[i]*y[next_i] - y[i]*x[next_i])
    
    # return area
    return np.abs(area)/2 

Surprisingly, this function can be shortened to a single expression! To understand why the code below works, check the NumPy [roll](https://numpy.org/doc/2.2/reference/generated/numpy.roll.html) and [dot](https://numpy.org/doc/stable/reference/generated/numpy.dot.html) functions.

In [None]:
def polyg_area(x,y):
    """
    calculate and return the area of a polygon
    from the x and y coordinates of its points 
    Note: points must be in sequential order
    """
    return 0.5*np.abs(np.dot(x, np.roll(y,1))
                      - np.dot(y, np.roll(x,1)))
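
To see what `np.roll` and `np.dot` contribute, it can help to try them on a shape whose area is known. The sketch below uses a 2 × 3 rectangle (illustrative coordinates) and a hypothetical helper, `signed_area`, that leaves out the absolute value so the effect of point ordering stays visible.

```python
import numpy as np

def signed_area(x, y):
    # np.roll(a, 1) shifts the elements one position to the right,
    # so position i holds the previous point and the last wraps to the front
    x_prev = np.roll(x, 1)
    y_prev = np.roll(y, 1)
    # each dot product sums one family of cross terms in the formula
    return 0.5 * (np.dot(x_prev, y) - np.dot(y_prev, x))

# 2 x 3 rectangle, points ordered counterclockwise
x = np.array([0.0, 2.0, 2.0, 0.0])
y = np.array([0.0, 0.0, 3.0, 3.0])

ccw = signed_area(x, y)             # counterclockwise ordering
cw = signed_area(x[::-1], y[::-1])  # same points, clockwise ordering

print(np.roll(y, 1))
print(ccw, cw)
```

The two results have the same magnitude and opposite signs, which is why `polyg_area` wraps the expression in `np.abs`.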

The file [net_oil.txt](../data/net_oil.txt) in the data directory contains the x (east), y (north), and z (value) coordinates of contours in an isochore map (vertical thickness) of net oil in a trap. All values are in meters. Let’s compute the volume of the trap. We first determine the area of each contour using our function:

In [None]:
path = os.path.join("..", "data", "net_oil.txt") 
contours = np.loadtxt(path) # read the contours

c_values = np.unique(contours[:,2]) # contour values
c_areas = np.zeros(c_values.size) # initialize areas

for i in range(c_values.size):
    # extract contour
    contour = contours[contours[:,2] == c_values[i]] 
    # calculate area
    c_areas[i] = polyg_area(contour[:,0],contour[:,1])
    # plot contour
    plt.plot(contour[:,0],contour[:,1],".",markersize=2)

print("Areas in m2 = ", c_areas.astype(int)) # print areas
plt.axis("equal") # set equal axis
plt.show() # show plot

In the code above, we also plotted the contours using the [Matplotlib](https://matplotlib.org) library. While this will be covered in more detail in Chapter 6, for now just notice how straightforward it is to visualize data in Python.

Mathematically, the volume of the trap can be expressed as:

$$V=\int_a^b A(z) d z$$

To understand why this works, imagine slicing the volume into many horizontal layers—these are the contours. We can estimate the volume between each pair of adjacent contours and then sum them all to obtain the total volume.

So, to calculate the volume, we need to perform an integration. We can do this using the [scipy.integrate](https://docs.scipy.org/doc/scipy/tutorial/integrate.html) module. In the code below, we use the trapezoidal rule, implemented as the `trapezoid()` function, to carry out the integration:

In [None]:
v_oil = integrate.trapezoid(c_areas, c_values) # volume of oil
print(f"Volume of oil: {v_oil:.0f} m3 or {v_oil*6.2898:.0f} bbl") 
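
The trapezoidal rule can also be written out by hand, which makes the slab-by-slab picture described above explicit. The sketch below is a plain-Python version using made-up contour values and areas (they are not taken from `net_oil.txt`): each pair of adjacent contours bounds a slab whose volume is its thickness times the mean of the two areas.

```python
# hypothetical contour values (m) and enclosed areas (m2), for illustration only
values = [0.0, 10.0, 20.0]
areas = [1000.0, 600.0, 200.0]

volume = 0.0
for k in range(len(values) - 1):
    thickness = values[k + 1] - values[k]        # slab height
    mean_area = 0.5 * (areas[k] + areas[k + 1])  # average of the two bounding areas
    volume += thickness * mean_area              # add the slab volume

print(f"Volume: {volume:.0f} m3")
```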

## Classes

Python is fundamentally an object-oriented programming (OOP) language, and object-oriented design underpins many of its libraries and tools. OOP organizes code around objects: structures that combine data with the functions (called methods) that operate on that data. The two main building blocks of OOP are classes and objects.

A class is a blueprint for creating objects. It defines the shared structure and behavior its instances will have. For example, an `Employee` class might include attributes like `name` and `salary`, and a method like `raise_salary`. From this class, you can create multiple employee objects, each with its own data but the same general behavior.
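
As a rough sketch, the `Employee` class described above could look like the code below. The names, the salary figures, and the `percent` parameter of `raise_salary` are assumptions made for illustration.

```python
class Employee:
    """A hypothetical employee with a name and a salary"""
    def __init__(self, name, salary):
        self.name = name
        self.salary = salary

    def raise_salary(self, percent):
        # increase the salary by the given percentage
        self.salary += self.salary * percent / 100


# two objects created from the same class, each with its own data
ana = Employee("Ana", 50000)
ben = Employee("Ben", 60000)

ana.raise_salary(10)  # only Ana's salary changes

print(ana.name, ana.salary)
print(ben.name, ben.salary)
```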

Python makes working with classes simple. Here’s a basic class definition:

```python
class ClassName:
    def __init__(self, parameters):
        # initialization code
        self.attribute = value

    def method(self):
        # method code
        pass
```
Let’s look at a simple example that defines a `Circle` class:

In [None]:
class Circle:
    """
    A class that implements a circle
    """
    # initialization requires center [x, y]
    # and radius of circle    
    def __init__(self, center, radius):
        self.center = center
        self.radius = radius
    
    # methods
    
    # circumference
    def circumference(self):
        return 2 * np.pi * self.radius
    
    # area
    def area(self):
        return np.pi * self.radius ** 2
    
    # x and y coordinates defining circle
    def coordinates(self):
        theta = np.arange(0, 361) * np.pi / 180 # 0 to 360 degrees so the curve closes
        x = self.radius * np.cos(theta) + self.center[0]
        y = self.radius * np.sin(theta) + self.center[1]
        return x, y
    
    # shift center in x
    def shift_in_x(self, x_value):
        self.center[0] += x_value
    
    # shift center in y
    def shift_in_y(self, y_value):
        self.center[1] += y_value

Now let’s use this class to fill a 20 × 20 unit square with circles of radius 1.
We’ll also calculate the areal porosity, which measures the fraction of the area
not occupied by the circles.

In [None]:
area_circles = 0.0 # initialize circles area
my_circle = Circle([-1, -1],1) # unit circle with center -1,-1

# use two nested loops
# i moves circle in y
# j moves circle in x
for i in range(10):
    my_circle.center[0] = -1 # reset x of circle center to -1
    my_circle.shift_in_y(2) # shift circle in y 2 units
    for j in range(10):
        my_circle.shift_in_x(2) # shift circle in x 2 units
        area_circles += my_circle.area() # update circles area
        x, y = my_circle.coordinates() # circle coordinates
        plt.plot(x,y,'r-') # plot circle

plt.axis("square") # square axis
plt.xlim([0, 20]) # x axis limits
plt.ylim([0, 20]) # y axis limits

# estimate and print areal porosity
area_total = 20 * 20
area_voids = area_total - area_circles
print(f"Areal porosity = {area_voids/area_total:.2f}")

Note that only a single `Circle` instance (`my_circle`) is created in the second line of the code. This circle is then shifted in the `y` (`shift_in_y()`) and `x` (`shift_in_x()`) directions within two nested loops to fill the square. In each iteration, the circle area is calculated using the `area()` method and added to the total. The coordinates of points along the circumference are generated using the `coordinates()` method and plotted with the Matplotlib `plot()` function.
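
As a quick analytical check on the printed value: the square holds $10 \times 10 = 100$ circles of radius $r = 1$, so the expected areal porosity is

$$
\phi = 1 - \frac{100\,\pi r^2}{20 \times 20} = 1 - \frac{\pi}{4} \approx 0.215
$$

The symbol $\phi$ is introduced here only for this check.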

Let’s look at another example. Building on our code to read data from Factpages, we’ll now define a top-level class responsible for setting up the main URL components and loading data from a descriptor.

In [None]:
class FP_reader:
    """
    Class to read NOD factpages
    """
    def __init__(self):
        """
        initialize strings to construct 
        the URL as of May 2025
        """
        self.u_1 = "https://factpages.sodir.no/public?/Factpages/external/tableview/"
        self.u_2 = "&rs:Command=Render&rc:Toolbar=false&rc:Parameters=f"
        self.u_3 = "&IpAddress=not_used&CultureCode=en&rs:Format=CSV&Top100=false"
    
    def read(self, descriptor):
        """
        Read data from NOD factpages
        Input:
            descriptor: NOD descriptor, 
            e.g. "field_production_monthly"
        Output:
            df: DataFrame with the data or
            empty if data could not be read
        """
        # construct the URL
        url = self.u_1 + descriptor + self.u_2 + self.u_3

        # start from an empty DataFrame, returned if the request fails
        self.df = pd.DataFrame()

        # request the data
        response = requests.get(url)
        # if the request was successful
        if response.status_code == 200:
            # load csv data into a DataFrame
            self.df = pd.read_csv(StringIO(response.text))
        else:
            print(f"Error: {response.status_code}")
        
        return self.df

Next, we’ll create a class that inherits from the top-level class and is responsible for reading data from specific fields.

In [None]:
class Field(FP_reader):
    """ 
    Class to read field data from NOD factpages 
    """
    def __init__(self):
        """
        Initialize field class
        """
        # call parent class
        super().__init__()

    def monthly_production(self, fields=None):
        """
        Read field production monthly data
        Input:
            fields: list of field names. Pass None
            or an empty list to read all fields
        Output:
            df: DataFrame with the data or
            empty if data could not be read
        """
        # call parent class method
        df = self.read("field_production_monthly")
        # if df and fields are not empty
        if not df.empty and fields:
            # filter by field names
            df = df[df["prfInformationCarrier"].isin(fields)]

        return df          

Note that the `Field` class currently has just one method, but it’s easy to extend the class by adding more methods to read additional datasets from the *Field* category in Factpages. Let’s read the monthly production from ConocoPhillips fields using the class:

In [None]:
# read monthly production data
# of ConocoPhillips fields
fields = ["EKOFISK", "ELDFISK", "TOMMELITEN GAMMA", 
          "TOR", "VEST EKOFISK", "ALBUSKJELL", "VALHALL", 
          "HOD", "TOMMELITEN A"]

field = Field() # create field object
df = field.monthly_production(fields) # read data
df.info() # print info
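
The inheritance pattern used by `Field` can also be seen in isolation, without any network access. In the minimal sketch below, `Reader` and `FieldReader` are hypothetical stand-ins for `FP_reader` and `Field`, and the `"table/"` prefix is a placeholder rather than a real URL.

```python
class Reader:
    """Hypothetical parent class holding a shared prefix"""
    def __init__(self):
        self.prefix = "table/"

    def locate(self, descriptor):
        # build an identifier from the shared prefix
        return self.prefix + descriptor


class FieldReader(Reader):
    """Hypothetical child class that reuses Reader.locate()"""
    def __init__(self):
        # run the parent initializer so that self.prefix exists
        super().__init__()

    def monthly_production(self):
        # call the inherited method with a fixed descriptor
        return self.locate("field_production_monthly")


reader = FieldReader()
location = reader.monthly_production()

print(location)
print(isinstance(reader, Reader))
```

The child class defines no `locate()` of its own; the call is resolved in the parent, which is the same mechanism that lets `Field` call `self.read()`.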

## Modular programming

The functions and classes we’ve created so far are only available within the current notebook. Wouldn’t it be convenient to store them in a way that allows us to easily import and use them in any other file when needed? In Python, we can achieve this by organizing them into *modules* and *packages*.

A module is a Python source file (`.py`) that contains code designed to perform a specific task. For example, the first Python file we created in this course, `my_first_code.py`, is actually a module. A package is a directory that contains modules or data, along with an initialization file (`__init__.py`), which signals to Python that the directory should be treated as a package.

To illustrate this, we have created a package called `factpages` to handle reading data from Factpages. In your code directory, look at the folder called `factpages`. Within this folder, there are three files: `__init__.py`, `reader.py`, and `field.py`.

- The [reader.py](factpages/reader.py) file has the code we wrote before to create the `FP_reader` class.

- The [field.py](factpages/field.py) file has the code we previously wrote to define the `Field` class. Additional methods for working with the *Field* category of Factpages are also included. Note that the `field` module imports the `FP_reader` class from the `reader` module. This is necessary because `Field` inherits from `FP_reader`.

- We want to be able to use the classes and functions directly from the package. To achieve this, the [`__init__.py`](factpages/__init__.py) file imports them from the modules into the main package namespace, making them accessible directly from the package.
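
The mechanics described above can be reproduced with a throwaway package. The sketch below builds a hypothetical package called `shapes_demo` in a temporary directory, with one module and an `__init__.py` that lifts a function into the package namespace. The names and file contents are assumptions made for illustration and do not reproduce the actual `factpages` files.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# build a throwaway package called "shapes_demo" in a temporary directory
root = Path(tempfile.mkdtemp())
pkg = root / "shapes_demo"
pkg.mkdir()

# a module: an ordinary .py file with a function in it
(pkg / "square.py").write_text("def area(side):\n    return side * side\n")

# the initialization file imports area() into the package namespace
(pkg / "__init__.py").write_text("from .square import area\n")

# make the parent directory importable, then import the package
sys.path.insert(0, str(root))
importlib.invalidate_caches()
shapes_demo = importlib.import_module("shapes_demo")

# area() is reachable from the package, not only from shapes_demo.square
result = shapes_demo.area(3)
print(result)
```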

Now we can use our package. Before running the cell below in the chapter’s notebook, make sure to clear all outputs and restart the kernel. This ensures that we’re starting with a clean slate.

In [None]:
import factpages as fp # import our package

# read monthly production data
# of ConocoPhillips fields
fields = ["EKOFISK", "ELDFISK", "TOMMELITEN GAMMA", 
          "TOR", "VEST EKOFISK", "ALBUSKJELL", "VALHALL", 
          "HOD", "TOMMELITEN A"]
field = fp.Field() # create field object
df = field.monthly_production(fields) # read monthly production data
df.info() # print info

We can also try another method:

In [None]:
df = field.investments(fields) # read investments data
df.info() # print info

I hope this gives you a sense of the power behind structuring your code into modules and packages. We’ve only scratched the surface, and there is much more to explore in modular programming.

## Exercise

Expand the functionality of the factpages package by adding a module to work with the *Wellbore* category. This module should have functions for extracting:

- Exploration wellbores from the current year (descriptor: `wellbore_exploration_current_year`)

- Exploration wellbores from the last year (descriptor: `wellbore_exploration_last_year`)

- Exploration wellbores from the last 10 years (descriptor: `wellbore_exploration_last_10_years`)

- Short list of all exploration wellbores (descriptor: `wellbore_exploration_all_short`)

- Long list of all exploration wellbores (descriptor: `wellbore_exploration_all`)

- Development wellbores (descriptor: `wellbore_development_all`)

- Other wellbores (descriptor: `wellbore_other_all`)

- CO2 storage wellbores (descriptor: `wellbore_co2_storage`)

All these functions should allow filtering the output by fields (`wlbField` column) and completion year (`wlbCompletionYear` column).
