# Making Predictions: An Example Using the Beer Dataset

We will go through the process of predicting Y quantities with SIMCA-Q.

We will use a SIMCA project where two OPLS models have been built using the BEER dataset typically used in SIMCA courses. The picture below shows the initial observations and variables of the original dataset used in this project:

![Original Dataset](Images/Dataset1.png)

This SIMCA file is available from this [repository folder](https://github.com/OEM-Sartorius-Data-Analytics/SimcaQ_Python_Scripting_Guide/tree/main/06_PredictionInterface_1) as *BEER_NIR_alcohol_example.usp*.

The SIMCA project file has two OPLS models, both predicting the alcohol content:

![Models](Images/Models.png)

In this example we will show how to predict the alcohol content of a spectrum that was not used to build the model. The spectrum is stored in a CSV file with two rows: one for the variable names and one for the variable values. Opened in Excel, it looks like this:

![Spectrum](Images/InputFile.png)

This csv file is available from this repository folder as [predictionDataFile2.csv](predictionDataFile2.csv).
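
As a rough illustration of the two-row layout described above, the sketch below parses such a file with Python's standard `csv` module. The helper name `read_single_observation` and the sample variable names and values are hypothetical and are not taken from *predictionDataFile2.csv*; the comma delimiter is also an assumption and may need adjusting for the real file.

```python
import csv
import io


def read_single_observation(csv_file):
    """Read a two-row CSV: first row variable names, second row values.

    Returns a (names, values) tuple with the values converted to float.
    """
    reader = csv.reader(csv_file)
    names = next(reader)                       # row 1: variable names
    values = [float(v) for v in next(reader)]  # row 2: variable values
    if len(names) != len(values):
        raise ValueError("Number of names and values does not match")
    return names, values


# Hypothetical stand-in for the real file so the sketch runs on its own.
sample = io.StringIO("1100,1102,1104\n0.512,0.498,0.475\n")
names, values = read_single_observation(sample)
print(names)
print(values)
```

With the real file, the same helper could be called on a handle opened with `open('predictionDataFile2.csv', newline='')`.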

## Accessing the SIMCA-Q COM Interface

We first need to access the SIMCA-Q COM interface. To do so, we will follow the [approach detailed in the guide](https://github.com/OEM-Sartorius-Data-Analytics/SimcaQ_Python_Scripting_Guide/blob/main/00_COM_and_License/COM_and_License.md).

In [1]:
def dispatch(app_name:str):
    try:
        from win32com import client
        app = client.gencache.EnsureDispatch(app_name)
    except AttributeError:
        # Corner case dependencies.
        import os
        import re
        import sys
        import shutil
        # Remove cache and try again.
        MODULE_LIST = [m.__name__ for m in sys.modules.values()]
        for module in MODULE_LIST:
            if re.match(r'win32com\.gen_py\..+', module):
                del sys.modules[module]
        shutil.rmtree(os.path.join(os.environ.get('LOCALAPPDATA'), 'Temp', 'gen_py'))
        from win32com import client
        app = client.gencache.EnsureDispatch(app_name)
    return app

In [2]:
simcaq = dispatch('Umetrics.SIMCAQ')

If we print the variable *simcaq* we can see that it is an *ISIMCAQ* object:

In [3]:
print(simcaq)

<win32com.gen_py.SIMCA-Q 17 Type Library.ISIMCAQ instance at 0x2770330105936>
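
The `dispatch()` helper above depends on the `win32com` package (pywin32), which is only usable on Windows. A minimal sketch of a pre-flight check follows; the function name `com_prerequisites_met` is hypothetical. It only verifies the platform and that the package can be found, and does not check that SIMCA-Q itself is registered or licensed.

```python
import importlib.util
import sys


def com_prerequisites_met() -> bool:
    """Return True when running on Windows with win32com importable."""
    if sys.platform != "win32":
        return False
    # find_spec returns None when the package is not installed.
    return importlib.util.find_spec("win32com") is not None


if com_prerequisites_met():
    print("pywin32 found: dispatch('Umetrics.SIMCAQ') can be attempted.")
else:
    print("COM is unavailable here: Windows with pywin32 installed is needed.")
```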


## Opening the SIMCA Project

We can use the *ISIMCAQ* method *OpenProject()* to open the SIMCA Project *BEER_NIR_alcohol_example.usp*:

In [4]:
pathSimcaProject = 'BEER_NIR_alcohol_example.usp'
project = simcaq.OpenProject(pathSimcaProject, "")

If we print the variable *project* we can see that it is an *IProject* object:

In [5]:
print(project)

<win32com.gen_py.SIMCA-Q 17 Type Library.IProject instance at 0x2770329488448>


As [discussed](https://github.com/OEM-Sartorius-Data-Analytics/SimcaQ_Python_Scripting_Guide/blob/main/01_ProjectInterface/ExploreProjectInterface.md), we can retrieve different attributes of the project straight from this object.

For instance, the name of the project:

In [6]:
projectName = project.GetProjectName() 
print(projectName)

BEER_NIR_alcohol_example


The number of models within the project:

In [7]:
numberModels = project.GetNumberOfModels()
print(numberModels)

2


The number of datasets:

In [8]:
numberDatasets = project.GetNumberOfDatasets()
print(numberDatasets)

2
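
The three queries above can also be gathered in one place. The sketch below wraps them in a small helper; `summarize_project` and the `_StubProject` class are hypothetical names introduced only so that the example runs without SIMCA-Q installed. In practice the *IProject* object returned by *OpenProject()* would be passed in instead of the stub.

```python
def summarize_project(project):
    """Collect basic attributes from an object exposing the IProject
    methods used above (GetProjectName, GetNumberOfModels,
    GetNumberOfDatasets)."""
    return {
        "name": project.GetProjectName(),
        "models": project.GetNumberOfModels(),
        "datasets": project.GetNumberOfDatasets(),
    }


class _StubProject:
    """Hypothetical stand-in that mimics the values shown above."""

    def GetProjectName(self):
        return "BEER_NIR_alcohol_example"

    def GetNumberOfModels(self):
        return 2

    def GetNumberOfDatasets(self):
        return 2


summary = summarize_project(_StubProject())
print(summary)
```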


## Accessing a Model

As [discussed](https://github.com/OEM-Sartorius-Data-Analytics/SimcaQ_Python_Scripting_Guide/blob/main/04_ModelInterface_0/ModelInterface_Introduction.md), we can access models by using the *IProject* method *GetModel()*, which receives the model number as an input parameter.

In the example below we retrieve handles for all the models in the project and store them in a list: