## What is this project about?

We use state-of-the-art Bayesian causal modeling tools ([ChiRho](https://github.com/BasisResearch/chirho)) to investigate the effect of parking zoning reform in Minneapolis on the development of new housing units, at the relatively fine-grained level of census tracts. Minneapolis is an example of a city that has navigated the housing crisis somewhat successfully, and a parking zoning reform has been claimed to be connected to this outcome (see for example [here](https://reason.com/2024/02/27/fear-loathing-and-zoning-reform-in-minnesota/) and [here](https://www.strongtowns.org/journal/2023/9/15/ending-minimum-parking-requirements-was-a-policy-win-for-the-twin-cities)).



Whether this is so, to what extent, and with what uncertainty has been unclear. Yes, the number of housing units in the city increased faster after the reform. But it is not obvious whether this is a mere correlation, arising from other variables being causally responsible or from random variation. We decided to take a deep dive and connect detailed census tract data with demographic variables within a carefully devised causal model to investigate. Due to data availability limitations, we start at year 2010. Since a major worldwide event changed too many things in 2020, this is where our data collection stops, so that we can separate the zoning concerns from the complex and unprecedented events that followed. It turns out that even with only 10 years of data, causal modeling allows us to offer some (admittedly uncertain) answers.

## Why this is not a typical machine learning project

A typical predictive project in machine learning tends to use as much data as possible, together with algorithms that identify patterns, and focuses only on predictive accuracy. While such an approach is useful, its key limitation is that such models have a hard time distinguishing accidental correlations from causal connections, and are therefore not reliable guides to counterfactual prediction and causal effect estimation. Moreover, a typical model often disregards information that humans use heavily: temporal, spatial, or causal structure, which is needed to generalize well outside the training data.

Instead, we use our core open source technology, [ChiRho](https://github.com/BasisResearch/chirho), to build **Bayesian causal models** using hand-picked relevant variables. This way, we can work with humans in the loop. Because we use Bayesian methods, we can inject human understanding of the causal dependencies, which then works in symbiosis with the data, even if the data is somewhat limited, and we can honestly assess the resulting uncertainties. Because the model is causal, we have a chance to address counterfactual queries involving alternative interventions.
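
To illustrate how prior understanding and limited data combine in a Bayesian analysis, here is a minimal sketch of a textbook conjugate normal update. It is not the project's model: the prior, the noise level, and the four observations are made-up numbers chosen only for illustration.

```python
import math

def normal_posterior(prior_mean, prior_sd, noise_sd, observations):
    """Posterior over an unknown mean, given a normal prior and
    normally distributed observations with known noise_sd."""
    n = len(observations)
    # Precisions (inverse variances) of prior and data add up.
    precision = 1.0 / prior_sd**2 + n / noise_sd**2
    mean = (prior_mean / prior_sd**2 + sum(observations) / noise_sd**2) / precision
    return mean, math.sqrt(1.0 / precision)

# Hypothetical numbers: a skeptical prior centered at 0 and four noisy observations.
post_mean, post_sd = normal_posterior(0.0, 1.0, 1.0, [1.5, 2.5, 2.0, 2.0])
print(f"posterior mean = {post_mean:.2f}, posterior sd = {post_sd:.2f}")
```

The posterior mean lands between the prior mean and the sample average, and the posterior standard deviation states how much uncertainty remains after seeing only a handful of data points.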




## Why care about different types of questions?

Once we start thinking in causal terms, there are **multiple types of queries** that we can distinguish and answer using the model, and such questions typically have different answers. While associative information is often useful or revealing, just as often we want to evaluate the potential consequences of acting one way or another, and in that mode of reflection we turn to thinking in terms of interventions and counterfactuals.

- *Association*. Example: Is there a correlation between increased green spaces and decreased crime rate in an area? Perhaps, areas with more green spaces do tend to have lower crime rates for various reasons.

- *Intervention*. Example: If the city implements a zoning change to create more green spaces, how would this impact the crime rate in the area? The answer might differ here: factors other than the policy change probably influence crime rates to a large extent.

- *Counterfactual*. Example: Suppose you did create more green spaces and the crime rate in the area did go down. Are you to be thanked? This depends on whether the crime rate would have gone down had you not created more green space in the area. Would it?
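
The gap between association and intervention can be sketched with a toy simulation. This is an illustrative example only, not part of the project's model: the variable names (`income`, `green`, `crime`) and all coefficients are made up, with `income` acting as a hypothetical confounder that affects both green space and crime.

```python
import numpy as np

def simulate(n=100_000, seed=0, do_green=None):
    """Toy structural causal model with made-up coefficients:
    income -> green, income -> crime, green -> crime."""
    rng = np.random.default_rng(seed)
    income = rng.normal(size=n)  # hypothetical confounder
    if do_green is None:
        # Observational regime: green space depends on income.
        green = 0.8 * income + rng.normal(size=n)
    else:
        # Intervention do(green = value): the link from income is cut.
        green = np.full(n, float(do_green))
    crime = -0.2 * green - 1.0 * income + rng.normal(size=n)
    return green, crime

# Association: slope of crime on green in observational data.
green, crime = simulate()
assoc_slope = np.cov(green, crime)[0, 1] / np.var(green, ddof=1)

# Intervention: change in mean crime when green is set to 1 rather than 0.
_, crime_do1 = simulate(do_green=1.0)
_, crime_do0 = simulate(do_green=0.0)
causal_effect = crime_do1.mean() - crime_do0.mean()

print(f"associational slope: {assoc_slope:.2f}")
print(f"interventional effect: {causal_effect:.2f}")
```

In this toy setup the associational slope is several times larger in magnitude than the interventional effect of -0.2, because income drives both variables. Since the two interventional runs share the same seed, each simulated unit keeps its own income and noise, so the per-unit difference `crime_do1 - crime_do0` can also be read as a unit-level counterfactual contrast.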





## Counterfactual modeling of the zoning reform

In the case at hand, we allow you, the user, to investigate predicted counterfactual outcomes of a zoning reform, specified in terms of where the two zones start, what parking limits are imposed in the different zones, and the year in which the reform is introduced. From among the available variables, we hand-picked the ones that are most useful and meaningfully causally connected. The model simultaneously learns the strengths of over 30 causal connections and uses this information to inform its counterfactual predictions. The structural assumptions we have made can be described at a high level by the diagram below. A reasonably proficient user can also use our [open source codebase](https://github.com/BasisResearch/cities) to tweak or modify these assumptions and investigate the consequences of doing so.
    

<img src="tracts_dag_plot_high_density.png" alt="DAG Plot" width="800"/>


## How does the model perform?

Adding the causal layer, however, should not come at the cost of predictive power. The models went through a battery of tests on split data, each time accounting for around 25-30% of the variation in the data (which is fairly decent performance for such a noisy problem). On average, this improves predictions of the number of new housing units in each census tract in each year by 35-40 units over a null model. A detailed notebook with model testing is also available in our open source codebase.
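
For readers unfamiliar with these two kinds of metric, here is a minimal sketch of how a share of variation explained and an improvement over a null model could be computed on held-out data. It makes several assumptions: the null model is taken to predict the training mean, the improvement is measured as a difference in mean absolute error, and the numbers are toy values. The testing notebook in the codebase is the authoritative source for how the reported figures were obtained.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Share of variation in y_true accounted for by y_pred."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def mae_gain_over_null(y_train, y_test, y_pred):
    """Reduction in mean absolute error relative to a null model
    that always predicts the training mean (an assumed baseline)."""
    y_test = np.asarray(y_test, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    null_pred = np.mean(np.asarray(y_train, dtype=float))
    mae_null = np.mean(np.abs(y_test - null_pred))
    mae_model = np.mean(np.abs(y_test - y_pred))
    return mae_null - mae_model

# Toy numbers, not project data.
y_train = [10, 20, 30]
y_test = [0, 40, 20]
y_pred = [10, 30, 20]

r2 = r_squared(y_test, y_pred)
gain = mae_gain_over_null(y_train, y_test, y_pred)
print(f"R^2 = {r2:.2f}, MAE gain over null = {gain:.2f}")
```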