# Creating a Pipeline

As mentioned, a pipeline in this context is a machine learning pipeline made up of components, each responsible for a different part of the machine learning process. We have defined a few steps for creating the necessary components and connecting them into the final pipeline. To complete these steps you need at least a basic understanding of the technologies and tools used and how they work.

Some of these technologies and tools are required in order to deploy the components using the AI builder tools: Protocol Buffers (protobuf), gRPC and Docker.

Each component consists of a gRPC server that contains functions corresponding to the purpose of that component, such as data cleaning or model creation. A component can also include a web application through which the user interacts with it, but this is optional. The server code, the service it implements, the web application (if any) and all other necessary code are then packaged into a single container.

To create a gRPC server you have to write a protofile that defines the data that can be passed to the server and the functions the server implements. This protofile is uploaded to the AI builder platform together with the Docker image (container). The platform uses the protofiles to connect the different gRPC servers to each other.
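To make the split between "data" and "functions" concrete, here is a minimal sketch in plain Python that mirrors what a protofile declares. It deliberately does not use the gRPC library. The names `RawData`, `CleanData`, `DataCleaningService` and `Clean` are hypothetical, and in a real component the corresponding classes would be generated from the protofile rather than written by hand.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical message types. In a real component these would be declared
# as messages in the protofile and generated by the protobuf compiler.
@dataclass
class RawData:
    values: List[float] = field(default_factory=list)


@dataclass
class CleanData:
    values: List[float] = field(default_factory=list)


class DataCleaningService:
    """Hypothetical service with a single function, as a protofile service would declare."""

    def Clean(self, request: RawData) -> CleanData:
        # Drop NaN values (NaN is the only float that is not equal to itself).
        return CleanData(values=[v for v in request.values if v == v])


if __name__ == "__main__":
    reply = DataCleaningService().Clean(RawData(values=[1.0, float("nan"), 3.0]))
    print(reply.values)
```

The point of the sketch is the shape: one message type going in, one message type coming out, and a named function in between. That shape is what the protofile has to describe.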

We can now expand our graph explaining the process of creating and deploying a pipeline. In the graph below, you can see which files each component contains, and how the playground orchestrates the gRPC requests and responses between the different gRPC servers.

![pipeline creation graph](./pic1.1.3.PNG)


The steps to create a pipeline are the following:

# 1. Identify the problem, objectives and deliverable items
The first step in creating a pipeline is to identify the problem, that is, what you need the pipeline for. Once you have a clear understanding of the needs and requirements, it is easier to decide which components are necessary and how to implement them. Defining the problem and objectives clearly should also let you identify the deliverable items. At this point it also helps to have a clear understanding of the required tools and technologies, as this simplifies the planning.

# 2. Create the outline of the pipeline
Next you need to define how many components are needed and what each component should do. You should also decide on any additional technologies required for the components, such as programming languages. When defining the components it is especially important to define the input and output of each one, as these are what connect the components to each other. For example, the output of a data cleaning component becomes the input of the next component, such as a training component, as data moves through the pipeline. Defining the inputs and outputs clearly also simplifies the coding, because you know what is expected of each component.
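The idea that one component's output is the next component's input can be sketched in a few lines of plain Python. This is only an illustration of the outline stage: the functions `clean`, `train` and `run_pipeline` are hypothetical, the "model" is a placeholder, and in the actual pipeline each step would be a separate gRPC server called by the orchestrator.

```python
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Optional[float]]


def clean(rows: List[Row]) -> List[Row]:
    """Hypothetical data cleaning component: drop rows with missing values."""
    return [r for r in rows if all(v is not None for v in r.values())]


def train(rows: List[Row]) -> Dict[str, float]:
    """Hypothetical training component: a stand-in 'model' that stores the mean of y."""
    ys = [r["y"] for r in rows]
    return {"mean_y": sum(ys) / len(ys)}


def run_pipeline(data: Any, steps: List[Callable[[Any], Any]]) -> Any:
    """Pass the output of each step as the input of the next one."""
    for step in steps:
        data = step(data)
    return data


if __name__ == "__main__":
    rows = [{"x": 1.0, "y": 2.0}, {"x": None, "y": 9.0}, {"x": 3.0, "y": 4.0}]
    print(run_pipeline(rows, [clean, train]))
```

Writing the type of each input and output down this early makes mismatches visible before any server code exists: `train` can only follow `clean` because it accepts exactly what `clean` returns.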

# 3. Implement the components
The third step is to implement the components. This includes writing the service code, testing the services, writing a protofile for each component, generating and integrating the gRPC code, optionally creating web applications, and finally containerizing the components using Docker. Most of the material covers the creation of a component, as it is quite technical and requires multiple steps.

After implementing a component it is important to test it carefully before trying to deploy the pipeline, both to ensure that the functionality has been implemented correctly and to simplify debugging. Testing the components throughout the entire creation process is important for a reliable solution. We will also cover how to perform tests throughout the creation process.
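As one possible way to test service logic before it is wrapped in a gRPC server or a container, the sketch below uses Python's built-in `unittest` module. The function `scale_to_unit_range` is a hypothetical stand-in for a component's service function and is not part of the course material.

```python
import unittest
from typing import List


def scale_to_unit_range(values: List[float]) -> List[float]:
    """Hypothetical service function: rescale values linearly to the range [0, 1]."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if lo == hi:
        # All values are equal, so there is no range to scale over.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


class ScaleToUnitRangeTest(unittest.TestCase):
    def test_scales_to_unit_range(self):
        self.assertEqual(scale_to_unit_range([2.0, 4.0, 6.0]), [0.0, 0.5, 1.0])

    def test_empty_input(self):
        self.assertEqual(scale_to_unit_range([]), [])

    def test_constant_input(self):
        self.assertEqual(scale_to_unit_range([5.0, 5.0]), [0.0, 0.0])
```

If this is saved in a file such as `test_component.py` (the file name is an assumption), the tests can be run with `python -m unittest test_component`. Keeping the service logic in plain functions like this means it can be tested without starting a server.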

# 4. Upload the Docker images
Once you have ensured that all components work well, you can containerize the applications and push the resulting images to Docker Hub. This step requires that you have written a Dockerfile for each of the components. Uploading to Docker Hub simplifies the later step of uploading the components to the AI builder platform.

# 5. Connect the components in AI builder
When the components have been finalized and uploaded to Docker Hub, you can use the AI builder platform to upload and connect the different components. This creates a deployable pipeline. The pipeline can either be deployed directly from the platform, or a solution.zip file can be downloaded for local deployment using the playground app.

# 6. Run the playground app locally
In order to deploy the pipeline you will need to configure the playground app provided as part of the AI builder project for local deployment. This will be covered in a later chapter. 

