## Filtering Data Based on Criteria
In this lesson, we will use a small dataset of weather projections for Chapel Hill covering Thursday, March 25th, through Saturday, April 3rd. Each row is the projection for one day in that timeframe, in chronological order.

Our analysis goal is to find the average high temperature on days when rain is unlikely (a chance of rain below 30%).

We will approach this problem from a column-oriented perspective.
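
To make "column-oriented" concrete, here is a minimal sketch that converts a row-oriented table into a column-oriented one. The names `rows` and `columns` are hypothetical and used only for illustration; the values are the first three days of this lesson's data.

```python
# Row-oriented: one dict per day (first three days of the lesson's data).
rows: list[dict[str, float]] = [
    {"high": 77, "low": 67, "rain": 0.3},
    {"high": 84, "low": 51, "rain": 0.2},
    {"high": 78, "low": 64, "rain": 0.4},
]

# Column-oriented: one list per measurement, all lists the same length.
columns: dict[str, list[float]] = {}
for row in rows:
    for name, value in row.items():
        columns.setdefault(name, []).append(value)

print(columns["rain"])  # [0.3, 0.2, 0.4]
```

In the column-oriented form, index `i` of every list refers to the same day, which is what lets us filter one column using criteria computed from another.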

First, let's look at our dataset.

In [None]:
col_data: dict[str, list[float]] = {
    "high": [77, 84, 78, 79, 65, 67, 74, 61, 55, 61],
    "low":  [67, 51, 64, 45, 43, 53, 56, 37, 34, 42],
    "rain": [.3, .2, .4, .8, 0., .2, .4, .5, .1, .1]
}

col_data



# Produce a "Mask" Based On Criteria

In [None]:
def less_than(col: list[float], threshold: float) -> list[bool]:
    """Returns a list of bools: True where the item is less than threshold, False otherwise."""
    result: list[bool] = []
    for item in col:
        result.append(item < threshold)
        # The line above is shorthand for the following:
        # if item < threshold:
        #     result.append(True)
        # else:
        #     result.append(False)
    return result

# Example call: mask of the days with less than a 30% chance of rain
no_rain_mask: list[bool] = less_than(col_data["rain"], 0.3)

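
To see what a mask looks like next to the data it was built from, here is a minimal sketch. It writes the same comparison as `less_than` in list-comprehension form and pairs each rain value with its mask entry; the names `rain` and `mask` are used only for this illustration.

```python
rain: list[float] = [.3, .2, .4, .8, 0., .2, .4, .5, .1, .1]

# Same comparison as less_than(rain, 0.3), written as a list comprehension.
mask: list[bool] = [chance < 0.3 for chance in rain]

# Pair each value with its mask entry to see how they line up.
for chance, keep in zip(rain, mask):
    print(chance, keep)
```

Note that the comparison is strict: the first day's 0.3 chance of rain is not less than 0.3, so its mask entry is `False`.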

# Masked Function
Takes a column and a mask (a list of bool values of the same length) and returns only the values from the column whose corresponding mask entry is True.

In [None]:
def masked(col: list[float], mask: list[bool]) -> list[float]:
    """Returns the items of col whose corresponding mask entry is True."""
    result: list[float] = []
    for i in range(len(mask)):
        if mask[i]:
            result.append(col[i])
    return result

# Test call
highs_of_no_rain_days: list[float] = masked(col_data["high"], no_rain_mask)
print(highs_of_no_rain_days)

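
`masked` assumes the column and the mask have the same length. As a sketch of one way to make that assumption explicit, the hypothetical variant below (`masked_checked` is not part of the lesson's code) raises an error on a length mismatch and uses `zip` to walk both lists together.

```python
def masked_checked(col: list[float], mask: list[bool]) -> list[float]:
    """Like masked, but refuses a column and mask of different lengths."""
    if len(col) != len(mask):
        raise ValueError("col and mask must have the same length")
    # Keep each item whose paired mask entry is True.
    return [item for item, keep in zip(col, mask) if keep]


highs: list[float] = [77, 84, 78, 79, 65, 67, 74, 61, 55, 61]
no_rain: list[bool] = [False, True, False, False, True, True, False, False, True, True]
print(masked_checked(highs, no_rain))  # [84, 65, 67, 55, 61]
```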

# Compute the Average

In [None]:
def mean(col: list[float]) -> float:
    """Returns the mean of a column."""
    return sum(col)/len(col)

mean(highs_of_no_rain_days)

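
For the five low-rain days, the arithmetic is (84 + 65 + 67 + 55 + 61) / 5 = 332 / 5 = 66.4. One caveat: if a mask selects no days at all, `mean` divides by zero. The sketch below shows one possible guard; `mean_or_none` is a hypothetical helper, and returning `None` for an empty column is an assumption about what we would want, not the only reasonable choice.

```python
from typing import Optional


def mean_or_none(col: list[float]) -> Optional[float]:
    """Returns the mean of a column, or None if the column is empty."""
    if len(col) == 0:
        return None
    return sum(col) / len(col)


highs_of_no_rain_days: list[float] = [84, 65, 67, 55, 61]
print(sum(highs_of_no_rain_days), len(highs_of_no_rain_days))  # 332 5
print(mean_or_none(highs_of_no_rain_days))                     # 66.4
print(mean_or_none([]))                                        # None
```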

# With these helper functions, we can perform many analyses!

In [None]:
# Average chance of rain on days whose low is below 50
cold_days_mask: list[bool] = less_than(col_data["low"], 50)
rain_on_cold_days: list[float] = masked(col_data["rain"], cold_days_mask)
print(mean(rain_on_cold_days))


In [None]:
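def not_mask(mask: list[bool]) -> list[bool]:
    """Returns a mask with every entry flipped: True becomes False and vice versa."""
    result: list[bool] = []
    for item in mask:
        result.append(not item)
    return result

# Average low on days whose high is NOT less than 80 (that is, at least 80)
mask_a: list[bool] = less_than(col_data["high"], 80)
mask_b: list[bool] = not_mask(mask_a)

values: list[float] = masked(col_data["low"], mask_b)
print(mean(values))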

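
Just as `not_mask` flips a mask, two masks can be combined so that a day is kept only when both criteria hold. The sketch below is one possible way to do that; `and_mask` is a hypothetical helper that is not defined elsewhere in this lesson, and the comparisons are written inline so the cell runs on its own.

```python
def and_mask(a: list[bool], b: list[bool]) -> list[bool]:
    """Returns a mask that is True only where both input masks are True."""
    return [x and y for x, y in zip(a, b)]


low: list[float] = [67, 51, 64, 45, 43, 53, 56, 37, 34, 42]
rain: list[float] = [.3, .2, .4, .8, 0., .2, .4, .5, .1, .1]

cold: list[bool] = [item < 50 for item in low]    # same result as less_than(low, 50)
dry: list[bool] = [item < 0.3 for item in rain]   # same result as less_than(rain, 0.3)

cold_and_dry: list[bool] = and_mask(cold, dry)
print(cold_and_dry)
```

The combined mask can then be passed to `masked` like any other mask, for example to average the highs on days that are both cold and unlikely to be rainy.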