# FPM: Funding-Productivity Model

This notebook is a simple model that estimates the funding a research lab needs to reach a target level of productivity. It assumes that a lab's productivity increases with the funding it receives.

The sections below describe the expected data format and the model's key assumptions.

## Data Formatting

The data is formatted as a table with the following columns:

- `Year`: The year of the data
- `Funding`: The funding received by the lab in that year
- `Productivity`: The productivity of the lab in that year
- `Target Productivity`: The target productivity of the lab
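
As a minimal sketch of this layout, the table can be built as a pandas DataFrame and checked for the four columns listed above. The values are hypothetical and the units are arbitrary, since the notebook does not specify them; `REQUIRED_COLUMNS` and `check_columns` are illustrative names, not part of the notebook.

```python
import pandas as pd

# Column names taken from the list above
REQUIRED_COLUMNS = ["Year", "Funding", "Productivity", "Target Productivity"]

# Hypothetical values for illustration only (units are not specified)
example = pd.DataFrame({
    "Year": [2020, 2021, 2022],
    "Funding": [100.0, 120.0, 150.0],
    "Productivity": [8.0, 9.5, 10.5],
    "Target Productivity": [12.0, 12.0, 12.0],
})


def check_columns(df):
    """Return the required columns that are missing from df, in order."""
    return [col for col in REQUIRED_COLUMNS if col not in df.columns]


missing = check_columns(example)
print(missing)
```

An empty list means the table has every expected column; any names returned are the ones to add before running the model.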


## Key Assumptions

1. **Productivity is proportional to funding**: To a first approximation, the productivity of the lab is assumed to scale with the funding it receives. This is a simplifying assumption, and the next assumption refines it.
1. **Diminishing returns to funding**: The relationship between funding and productivity is assumed to be concave, meaning that the marginal productivity of funding decreases as funding increases.
1. **Lag between funding and productivity**: The effect of funding on productivity is assumed to be lagged, meaning that funding in one year affects productivity in the following year.

All three are common simplifying assumptions in the literature on research funding.
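
One way to make these assumptions concrete is a power-law response curve with a one-year lag. The sketch below is an illustration under stated assumptions, not the notebook's fitted model: the functional form `scale * funding ** exponent`, the default parameter values, the function names, and the funding figures are all hypothetical.

```python
def predicted_productivity(lagged_funding, scale=1.0, exponent=0.5):
    """Concave response: scale * funding ** exponent, with 0 < exponent < 1."""
    if not 0.0 < exponent < 1.0:
        raise ValueError("exponent must lie strictly between 0 and 1")
    return scale * lagged_funding ** exponent


def required_funding(target_productivity, scale=1.0, exponent=0.5):
    """Invert the response curve: funding needed to reach a target."""
    if not 0.0 < exponent < 1.0:
        raise ValueError("exponent must lie strictly between 0 and 1")
    return (target_productivity / scale) ** (1.0 / exponent)


# Hypothetical funding by year (arbitrary units)
funding_by_year = {2020: 100.0, 2021: 400.0, 2022: 900.0}

# Lag: funding in year t drives productivity in year t + 1
predicted = {
    year + 1: predicted_productivity(amount)
    for year, amount in funding_by_year.items()
}

# Diminishing returns: the same extra 100 units adds less at higher funding
gain_low = predicted_productivity(200.0) - predicted_productivity(100.0)
gain_high = predicted_productivity(900.0) - predicted_productivity(800.0)

print(predicted)
print(gain_low, gain_high)
print(required_funding(30.0))
```

Because the curve is invertible, the funding needed for a given target follows directly from `required_funding`; in practice `scale` and `exponent` would have to be estimated from the lab's data.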

In [9]:
# Install the semopy library if not already installed
%pip install semopy

# Import the necessary libraries
import semopy
import pandas as pd
import numpy as np

# Define your SEM model using the `semopy` syntax
model_desc = """
    # Measurement model
    Latent1 =~ Observed1 + Observed2
    Latent2 =~ Observed3 + Observed4

    # Regression structure
    Latent2 ~ Latent1
"""

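# Placeholder data: four independent random-noise columns, so the fit below
# is not expected to recover any meaningful structure
data = pd.DataFrame({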
    'Observed1': np.random.normal(size=100),
    'Observed2': np.random.normal(size=100),
    'Observed3': np.random.normal(size=100),
    'Observed4': np.random.normal(size=100)
})

# Set up the SEM model
model = semopy.Model(model_desc)

# Fit the model
model.fit(data)

# Get the summary of the model results
summary = model.inspect()
print(summary)


         lval  op       rval      Estimate   Std. Err   z-value   p-value
0     Latent2   ~    Latent1 -1.572642e+00   6.890117 -0.228246  0.819455
1   Observed1   ~    Latent1  1.000000e+00          -         -         -
2   Observed2   ~    Latent1 -3.625009e-01   0.796769 -0.454964  0.649135
3   Observed3   ~    Latent2  1.000000e+00          -         -         -
4   Observed4   ~    Latent2  7.594050e-02   0.963756  0.078796  0.937195
5     Latent1  ~~    Latent1  8.044758e-02   0.361783  0.222364  0.824031
6     Latent2  ~~    Latent2  6.569464e-01  10.814583  0.060746  0.951561
7   Observed1  ~~  Observed1  1.330150e+00   0.402323  3.306178  0.000946
8   Observed2  ~~  Observed2  1.012405e+00    0.15061  6.722047       0.0
9   Observed3  ~~  Observed3  1.122502e-18  10.778355       0.0       1.0
10  Observed4  ~~  Observed4  1.223853e+00   0.183902  6.654916       0.0


In [None]:
import pandas as pd

# Read the source data from a file or any other source
source_data = pd.read_csv('path/to/source_data.csv')

# Select the required fields from the source data
required_data = source_data[['Field1', 'Field2', 'Field3']]

# Remove any rows with missing values in the required fields
required_data = required_data.dropna(subset=['Field1', 'Field2', 'Field3'])

# Perform any additional data cleaning or transformations as needed

# Print the prepared data
print(required_data)
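
The same preparation steps can be sketched against the columns described under Data Formatting. This is a hedged illustration: the CSV contents are hypothetical and are read from an in-memory string rather than a real file, and the `Lagged Funding` column is an assumed way of encoding the one-year lag.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for the real source file
csv_text = """Year,Funding,Productivity,Target Productivity
2020,100,8.0,12
2021,120,,12
2022,150,10.5,12
"""

source = pd.read_csv(io.StringIO(csv_text))
columns = ["Year", "Funding", "Productivity", "Target Productivity"]

# Keep the required columns in chronological order
prepared = source[columns].sort_values("Year").reset_index(drop=True)

# Shift funding down one row so each year sees the previous year's funding.
# This assumes one row per consecutive year with no gaps.
prepared["Lagged Funding"] = prepared["Funding"].shift(1)

# Drop rows that lack either the lagged funding or the observed productivity
prepared = prepared.dropna(
    subset=["Lagged Funding", "Productivity"]
).reset_index(drop=True)

print(prepared)
```

The lag is computed before missing rows are dropped so that each remaining row is still paired with the funding of the year immediately before it.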
