<a href="https://colab.research.google.com/github/victorviro/Machine-Learning-Python/blob/master/Planning_and_scoping_ML_projects.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Table of contents



1. [Introduction](#1)
2. [Why do so many ML projects fail?](#2)
3. [Planning](#3)
    1. [Common assumptions](#3.1)
        1. [Business knowledge](#3.1.1)
        2. [Data quality](#3.1.2)
        3. [Functionality](#3.1.3)
    2. [Plan for demos](#3.2)
    3. [Planning meetings](#3.3)
4. [Scoping and research](#4)
    1. [Experimental scoping - Research](#4.1)
    2. [Experimental scoping - Experimentation](#4.2)
    3. [Why is scoping so important?](#4.3)
    4. [Which implementation should we use?](#4.4)
5. [Communication of projects](#5)
    1. [Defining the problem](#5.1)
        1. [What do you want to build?](#5.1.1)
        2. [What does the ideal end-state look like?](#5.1.2)
        3. [When should we meet to share the progress?](#5.1.3)
    2. [Critical discussions](#5.2)
        1. [Post research phase](#5.2.1)
        2. [Post experimentation phase](#5.2.2)
        3. [Development sprint reviews](#5.2.3)
        4. [MVP review](#5.2.4)
        5. [Pre-production review](#5.2.5)
    3. [How to talk about results?](#5.3)
6. [References](#6)

    




# Introduction <a name="1"></a>

Machine learning (ML) is challenging. With **so many algorithms, it’s impossible to learn even a small fraction of them**. There are also **additional competencies** that a Data Scientist is expected to be familiar with: mid-level **Data Engineering** (DE) skills, **software development** skills, project management skills, **visualization** and presentation skills, and more.

The work required of an ML professional is intimidating. However, they just need enough:

- Software development skills to write modular code and implement unit tests, but they don’t need to know about the intricacies of asynchronous messaging brokering.

- DE skills to build the ETL for feature datasets for their models, but not how to construct a PB-scale streaming ingestion framework.

- Visualization skills to create plots and charts that communicate clearly what their research and models are doing, but not how to develop dynamic web apps with complex UX components.

- Project management experience to know how to properly define, scope, and control a project, but not to go through a PMP certification.

ML Engineering is about applying Software Engineering fundamentals to ML. It is a set of **standards, tools, processes, and methodology** that aims to increase the chances of having efficient project work and to make complex code bases easier to manage. It attempts to reduce the chances of projects getting canceled (budget, time, or complexity) or code being abandoned (unmaintainable or fragile code). It allows us to focus on solving the problems so that we can move on to solving more of them.

It is the roadmap to create more sustainable ML-based systems that can not only be deployed to production but can be maintained and updated, allowing businesses to reap the rewards of efficiency, profitability, and accuracy.

The figure below shows the fundamental steps of work involved in creating successful ML solutions.

![](https://i.ibb.co/SN5tVk2/MLE-roadmap.png)

ML Engineering is not just the path shown in the figure; it is also the **methodology within each stage**. It is how a Data Science team talks to the business about the problem, how research is done, the details of experimentation, the way that code is written, and the tools and technologies used while traveling along the roadmap. Together, these can reduce the risk of the worst outcome: that a project will be abandoned.

In this notebook, we will see the first two steps of the roadmap (planning and scoping) and how to communicate effectively in ML projects. But first, let's see the main reasons why so many ML projects fail.

# Why do so many ML projects fail? <a name="2"></a>

ML projects fail at an incredibly high rate. The 6 **major reasons for project failures** are shown in the figure below.

![](https://i.ibb.co/9prWH4t/ML-project-failures.png)

The **biggest reasons** for projects failing to meet the needs of a business are **in the planning and scoping phases** of a project.

Generally, project abandonment is due to a DS team that is either inexperienced with building a large-scale production model to solve a particular need or has simply failed to understand what outcome the business desires.

# Planning <a name="3"></a>

The planning phase attempts to **answer the following questions**:

- What do you want to build?
- When do you want it built by?

This will prevent the dreaded statement from the business team "This is not what we asked for".

Data scientists usually focus on "how" to build something instead of focusing on the "what" and the "when".

For **example**, suppose we work at an e-commerce company that wants to apply **personalized recommendations** to the website. Ultimately, they want to increase the chances that the user will buy products. The business invites the ML team to a **first planning meeting** where the **business simply states "We want personalization"**. This blanket **request is too vague**. **The business**, in their mental map of what personalization is, has assumptions that define what they are expecting. They **have deep knowledge of the rules that their products have**, how their inventory management system works, and information about the **agreements** they have with the vendors that dictate the placement of certain brands on the website.

The ML team then spends its time focused mostly on the technical details, thinking about how to solve the problem and looking for algorithms, APIs, and the requisite data for those algorithms. Their **problem-solving focus from a modeling perspective makes them blind to the business rules and requirements that may govern how their solution should work**. They aren’t asking the business the right questions and hence, basic details of expectations are not communicated. The ML team in the meeting must push for detailed explanations of expectations and requirements ("How should it work?", "What types of restrictions need to be in place on the recommendations?", "Should there be a consideration for how the products are displayed?", etc.). **The single question "What do you do now to decide what products to display in what places?" can reveal the critical feature requirements**.

The **fewer of these critical requirements that are known** before writing code and building a solution, the **greater the complications** in the delivery of an experimental prototype **and the chances that the project will be abandoned**. Since ML frequently intends to solve complex problems full of details that are specific to each business, **these struggles** are going to be **inevitable** and the best way to **minimize their impact** is to have **thorough discussions** that aim to **capture as many details about the problem, the data, and the expectations as possible**.

The vast majority of ML is a replacement for boring or repetitive work done by humans, and there is likely someone at the company that is attempting to accomplish the same thing. One of the easiest ways to approach this subject is by finding those people and asking them, **"teach me how you do it now, please"**.

After these details are ironed out, then a plan for model implementation experimentation can be developed.

Let's see some of the most common assumptions in ML.

### Common assumptions <a name="3.1"></a>

#### Assumption of business knowledge <a name="3.1.1"></a>

In the vast majority of organizations, **ML practitioners are insulated from the inner workings of a business**.

A good way to arrive at these details is to have a **subject matter expert (SME)** explain how they are actually solving the problem. Going through this exercise allows everyone to **understand the specific rules** that may be applied to govern the output of a model.

In our example of the recommendation engine, an SME can explain how they decide the ordering of products on each page of the website. In this particular case, vendor contracts dictate the presentation of certain products on the website. So modelers should know that there is a business process by which some suppliers of goods are promoted on the site over others.
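A business rule like this usually ends up as a post-processing step on top of the model output. The sketch below is only one possible reading of "some suppliers are promoted over others": it moves products from promoted vendors to the front while keeping the model's ordering within each group. The function name, the `vendor` field, and the vendor names are hypothetical; the real rule has to come from the SME and the contracts.

```python
def apply_vendor_promotions(ranked_products, promoted_vendors):
    """Put products from promoted vendors first.

    Python's sort is stable, so the model's original ordering is
    preserved inside the promoted and non-promoted groups.
    """
    return sorted(ranked_products, key=lambda p: p["vendor"] not in promoted_vendors)


# Hypothetical model output, already ordered by predicted relevance.
ranked = [
    {"id": "A", "vendor": "acme"},
    {"id": "B", "vendor": "globex"},
    {"id": "C", "vendor": "initech"},
    {"id": "D", "vendor": "globex"},
]

# Hypothetical contract: "globex" products must be shown first.
promoted = apply_vendor_promotions(ranked, promoted_vendors={"globex"})
promoted_ids = [p["id"] for p in promoted]
print(promoted_ids)
```

Knowing about such a rule before modeling matters because it changes what the prototype should be evaluated on: the ordering the user sees, not the raw model ranking.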

#### Assumption of data quality <a name="3.1.2"></a>

Data scientists usually spend **a lot of time** just **getting the data ready for modeling**. Even companies that successfully deploy many ML models to production still struggle with data quality issues regularly. **These problems are caused by the frequently complex systems that are generating the data**. The proper way to handle these problems is to anticipate them:

- **Validate the data** that will be involved through analysis before modeling begins.

- **Ask questions about the nature of the data** that will be involved **to the SMEs** who are most familiar with it (in the recommendation engine, "do all products get registered in our systems in the same way?").

**Although some issues might not be corrected for a demo** phase (due to the volume of work required and the need to avoid delaying the prototype for too long), **they must be communicated** so that they are taken into account when measuring the success of the prototype.

In the case of the recommendation engine, there was an issue of duplicated item data due to the retiring of older product IDs, which led the shoe division to use a separate product ID for each color of a style of shoe. Statistical reports may uncover this issue, particularly if the unique product count of shoes is orders of magnitude higher than that of any other category. The question "Why do we sell so many shoes?", posed during a planning meeting, can instantly uncover this issue with the shoes, and can also prompt a deeper validation of all product categories to ensure that the data going into the models is correct.
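A check of this kind is cheap to run before any modeling starts. The following is a minimal sketch that counts unique product IDs per category and flags categories whose count is far above the median. The catalog rows, the ID format, and the `ratio` threshold are all made up for illustration; a real check would read from the company's product tables.

```python
from collections import defaultdict
from statistics import median

# Hypothetical catalog rows: (product_id, category).
# Shoes get one ID per style and color, mimicking the issue described above.
catalog = [
    (f"S-{style:03d}-{color}", "shoes")
    for style in range(1, 5)
    for color in ("RED", "BLUE", "BLACK")
] + [
    ("T-001", "shirts"), ("T-002", "shirts"),
    ("H-001", "hats"), ("H-002", "hats"),
    ("B-001", "bags"),
]


def unique_products_per_category(rows):
    """Count distinct product IDs in each category."""
    ids_by_category = defaultdict(set)
    for product_id, category in rows:
        ids_by_category[category].add(product_id)
    return {category: len(ids) for category, ids in ids_by_category.items()}


def suspicious_categories(counts, ratio=5.0):
    """Return categories whose unique count exceeds `ratio` times the median count."""
    typical = median(counts.values())
    return sorted(c for c, n in counts.items() if n > ratio * typical)


counts = unique_products_per_category(catalog)
flagged = suspicious_categories(counts)
print(counts, flagged)
```

A flagged category is not proof of a data problem; it is a prompt for the "Why do we sell so many shoes?" conversation with the SMEs.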

#### Assumption of functionality <a name="3.1.3"></a>

In the case of the recommendation system, the business leaders are annoyed that the recommendations show a product that was purchased the week before. They need to express **how off-putting it would be for the end-user** to see this happen, so the entire team can **decide whether this is an important requirement that needs to be added when building the prototype or whether it can be deferred to following meetings** and added to the final product if the team agrees.
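If the team decides this requirement belongs in the prototype, it can be expressed as a small filter applied to the model output. This is a sketch under assumptions: the function name, the purchase-history format, and the 30-day window are hypothetical, and the right window is a business decision rather than a modeling one.

```python
from datetime import date, timedelta


def drop_recent_purchases(recommended, purchases, today, window_days=30):
    """Remove products the user bought within the last `window_days` days.

    `recommended` is a list of product IDs ordered by the model;
    `purchases` is a list of (product_id, purchase_date) pairs.
    """
    cutoff = today - timedelta(days=window_days)
    recent = {pid for pid, purchased_on in purchases if purchased_on >= cutoff}
    return [pid for pid in recommended if pid not in recent]


# Hypothetical user: bought P2 five days ago and P3 months ago.
recommended = ["P1", "P2", "P3"]
purchases = [("P2", date(2024, 1, 10)), ("P3", date(2023, 6, 1))]

filtered = drop_recent_purchases(recommended, purchases, today=date(2024, 1, 15))
print(filtered)
```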

### Plan for demos <a name="3.2"></a>

If the ML team attempts to **show a** presentation of their **demo to the larger team only once**, there is an inevitable **risk of large amounts of rework**. Without frequent demos as features are built, the team at large is simply relying on the word of the ML team, and the ML team, not having SME members, is relying on their notes that they took when asking questions in the planning meetings. For most projects, **there are too many details and nuances to confidently approach building out dozens of features without having them reviewed**.

In order to effectively produce a complex project, the **SME group needs to provide feedback** based on data that they can consume.

By **planning for demos ahead of time**, the ML development process can **adapt to the needs of the business experts**. The team can embrace a true Agile approach of **testing and demonstrating features as they are built, adapting their future work and adjusting elements** in a highly efficient manner.

The next figure shows the two opposite approaches to ML development.

![](https://i.ibb.co/QK6WPC6/ml-development-plan.png)

The main drawback with silo-development (top) is an inevitable risk of large amounts of rework. Sticking with an Agile approach to feature development, even during experimentation and prototype-building will help to ensure that the features that have been added actually meet the requirements of the SME team.

### Planning meetings <a name="3.3"></a>

It’s best to **keep charts, plots, and metrics internal to the ML team**. The non-technical audience will not understand them. Moreover, the time spent explaining what they mean is going to derail the demo (the audience will no longer be thinking about the problem, but rather about what these metrics mean).

Instead, we must **focus on visualizations that can convey that we’re solving the problem** with our modeling **in terms that the business team will be familiar with**.

An ideation discussion is best used for talking, drawing pictures, and explaining how something should work in its ideal end-state. Instead of thinking about the algorithmic solutions needed to solve the problem, the primary focus of ideation and planning is to start at the solution side, then work backward through functionality to ensure that the critical aspects of the project that mean the most to the internal (and external) customers are met. Only after these details are ironed out can a plan for model implementation experimentation be developed.

# Scoping and research <a name="4"></a>

Once we have been through the planning phase of the project, we know important details for the business and critical features that need to get built. Now it’s time to **plan our research**. The focus of scoping and research needs to **answer some questions**.

- What are we going to build?
- How long is this going to take to build?
- How complicated and expensive is this going to be?

This will prevent the dreaded statements from the business team "Why is this taking so long to build?" and "We have seen 9 demos and none of them have worked".

Making a conjecture about how long a project will take, which approach is going to be most successful, and the amount of resources that will be needed is a challenging exercise. The risk associated with making erroneous claims is high.

Knowing the critical aspects, the team can begin **estimating the work by setting expectations and boundaries** (for both time and level of implementation complexity). The ML team can then **provide the business with an expected delivery date** and a judgment call on what is or isn’t feasible.

Before estimating how long a project is going to take, it is **recommended to do some research**, especially if the project is quite complex. The ML team can review the latest research about what others have built in this space and evaluate algorithms, platforms, and so on. The data engineering team will be working closely with the ML team, generating data sets.

To show the business what’s going on, it is useful to have a plan for experimentation, to set boundaries on what will be researched, attempted, and what risks are present in each phase.

#### Experimental scoping - Research <a name="4.1"></a>

Most **ML practitioners love to experiment** and learn new things. With the depth there is in the ML space, **if the research and experimentation are left without boundaries**, the ML team could easily **spend the entire project timeline** on them. We should set boundaries around how long and how far we will go when researching a solution to a new problem. We want to **find**, perhaps not the "best solution", but **a "good enough solution"** to ensure that we’ll eventually get a product built out of our work. The business **doesn’t care how it’s built, so long as it’s built correctly and on time**.

We’re solving a business problem, not creating a new algorithm. For most use cases of ML, there are likely many people who have done it before. After a few **internet searches, and whitepaper readings**, the team can **identify** the **"broad strokes" for existing solutions in industry**, and the state of current research. For a search on technical implementations, one might find millions of results. It’s important to **recognize the frequently mentioned algorithms and approaches**, read a bit more on those, and determine what the most widely used direct applications are. 

The approaches that are candidates for testing (within the limits of the team’s capacity) are culled at the first stage. After a short period of research, some candidates are dismissed, resulting in a **limited number of alternatives to test against one another**.

Once these paths have been agreed upon, the team can set out to attempt to build prototypes. 

#### Experimental scoping - Experimentation <a name="4.2"></a>

The **goal of experimentation is to produce a simulation of the end product that allows for a comparison of the solutions**. We don't need to tune models or write production-ready code. The key is a balance between speed and comparability.

There are many things to consider when determining which approach to take, but the key is to **estimate the performance of the solutions, and the difficulty of developing the full solution**. 

**At the conclusion of this phase**, the final code complexity can be estimated, **informing the larger team of the estimated development time** that will be required to produce the project’s code base, **and of the daily run cost** that will be needed to retrain the models, generate inferences, host the data, and serve the data.

It can be helpful to create a **testing plan for the experimentation**. This plan, devoid of technical details, can be used **to track the status of the prototypes** that the ML team will be building. It can also be used as a communication tool for the larger team, helping to show what was done and what the results were, and it can accompany a demo of the implementations.

An additional helpful visualization to provide to the larger team when discussing experimental phases is an estimation of what the "broad strokes" of the solution will be from an ML perspective. A complex architectural diagram isn’t needed at this point, as it will change during development. However, a high-level diagram can help explain to the broader team what needs to be built to satisfy the solution.

**A note on experimental code quality**: Experimental code can be a little "janky", filled with charts, print statements, etc. It’s an experiment, so we likely won’t have time to create classes, methods, interfaces, etc. We should not worry about the state of the code at the end of experimentation. It should serve as a reference for the development phase, in which proper coding is done: building maintainable software and using standard software development practices.

#### Why is scoping so important? <a name="4.3"></a>

**Had the ML team all the time** in the world (and budget), they could sift through hundreds of white papers and even invent a novel approach. **They could spend months just researching** the best possible solution. **Instead of just testing two or three approaches** that have been proven to work for others in similar industries, they could work on **building prototypes for dozens of approaches**.

In the real world, companies don't pursue the finest possible application to solve a problem. Building a prototype takes time and effort: learning new APIs, researching, and so on. The **more time is spent on experimentation, the more money** will be spent to make this project a reality (the salaries of the team). Moreover, the **shortened time is intended to arrive at a decision boundary sooner**. If the team sets a cap on how long they’re going to try to make an approach work, they can get an estimate of how hard it would be to build the entire solution. If it takes 2 weeks to get even the first portion of an approach working, then how long will it take to integrate the other requirements into this solution? It may be worthwhile to abandon such a complex or expensive approach in favor of others. If the training time is long, that may prompt a cost-benefit analysis. Perhaps the model will be powerful, but will it pay for the cost to run it each day?
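The "will it pay for the cost to run it each day?" question can be turned into a back-of-the-envelope calculation during experimentation. The sketch below is deliberately simplistic, and every number in it is invented: it assumes the daily cost is compute hours times an hourly rate plus a flat storage cost, and that the business can supply a rough expected daily benefit.

```python
def daily_run_cost(training_hours, inference_hours, hourly_compute_rate, daily_storage_cost):
    """Rough daily cost: compute time for training and inference plus storage."""
    return (training_hours + inference_hours) * hourly_compute_rate + daily_storage_cost


def pays_for_itself(expected_daily_benefit, cost):
    """True when the expected daily benefit exceeds the daily cost."""
    return expected_daily_benefit > cost


# Hypothetical figures for two candidate approaches (same currency units).
complex_cost = daily_run_cost(
    training_hours=6.0, inference_hours=2.0, hourly_compute_rate=3.0, daily_storage_cost=4.0
)
simple_cost = daily_run_cost(
    training_hours=0.5, inference_hours=0.5, hourly_compute_rate=3.0, daily_storage_cost=4.0
)

expected_daily_benefit = 20.0  # assumed estimate provided by the business
print(complex_cost, pays_for_itself(expected_daily_benefit, complex_cost))
print(simple_cost, pays_for_itself(expected_daily_benefit, simple_cost))
```

Even a rough estimate like this gives the larger team something concrete to weigh against prediction quality when comparing approaches.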

During the experimentation, there will inevitably be elements that are missing from the overall data architecture of the business, and hence, an **analysis of the cost to maintain a project is nigh impossible to estimate accurately at this stage, but it is an important aspect to consider**:

- What additional ETL do we need to build?
- How often do we need to retrain models and generate inferences?
- What platform are we going to use to run these models?
- For the platform we need, are we going to use a managed service, or are we going to try to run it ourselves?
- Do we have expertise in running and maintaining services for ML of this nature?
- What is the cost of storage and where will the inference data live to support this project?

Answering all **these questions** before development begins is not needed, but they should be kept in mind and **revisited throughout the development process**. If there isn’t enough budget to run one of these engines, then perhaps a different project should be chosen.

The **time blocking** isn’t intended to force unrealistic expectations on the team, but to **prevent the team from wasting time** and energy on **never-to-be-realized** implementations (we can only build one solution for a project). It's **less painful to throw away an implementation if the team has only been working on it for a week** rather than months. Time-blocking is particularly critical to experimentation **when there is no one on the team who has solved a similar problem before**. In these cases, it is common for the team, during their research phase, to find a large list of possible solutions. Some of them will not solve the problem. It’s best to find this out early so that the team can pivot to worthwhile solutions.

### Which implementation should we use? <a name="4.4"></a>

Since the **predictive capabilities** of ML are the **primary focus in academia and research**, it is not surprising that most Data Scientists are pretty horrible at shifting their focus to solving a problem with a **balance between accuracy, cost, and timeliness of implementation**. Eventually, prioritizing speed of delivery becomes inevitable, since the business unit or customers are waiting for that model, and there are other problems at the company we need to solve.

It can be very useful to create a **weighted matrix report** that the larger team can use **to decide which implementation to use**. It gives the team a data-driven way to choose among the various tradeoffs that each implementation would have. The figure below shows an example of one such weighted matrix report.

![](https://i.ibb.co/SNrMnsZ/weighted-decision-matrix.png)

- If this matrix were populated by an **ML team**, they might give **heavy weightings to "Prediction Quality"**.
- A team of **ML Engineers** would likely over-emphasize **"Maintainability" and "Implementation Complexity"**. 
- The **Director** of Data Science might only care about **"Cost to run"**.
- The **project lead** is only interested in **"Prediction Quality"**. 

It is a **balancing act**. With **more people debating and explaining their perspectives**, the team can arrive at a more informed decision that helps to ensure a successful and long-running solution.
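To make the effect of the weightings concrete, here is a minimal sketch of a weighted decision matrix. The four criteria are the ones named above; the two candidate approaches, their 1-5 scores (higher is always better, so a high "Cost to run" score means cheap to run), and both sets of weights are hypothetical.

```python
CRITERIA = (
    "Prediction Quality",
    "Maintainability",
    "Implementation Complexity",
    "Cost to run",
)

# Hypothetical scores on a 1-5 scale, where higher is better for every criterion.
scores = {
    "approach_a": {"Prediction Quality": 5, "Maintainability": 2,
                   "Implementation Complexity": 2, "Cost to run": 1},
    "approach_b": {"Prediction Quality": 3, "Maintainability": 4,
                   "Implementation Complexity": 4, "Cost to run": 4},
}


def weighted_score(candidate_scores, weights):
    """Weighted average of a candidate's scores across all criteria."""
    total_weight = sum(weights[c] for c in CRITERIA)
    return sum(candidate_scores[c] * weights[c] for c in CRITERIA) / total_weight


def rank(all_scores, weights):
    """Candidate names ordered from best to worst weighted score."""
    return sorted(all_scores, key=lambda name: weighted_score(all_scores[name], weights),
                  reverse=True)


# A modeling-focused weighting versus an equal weighting agreed on by the whole team.
modeler_weights = {"Prediction Quality": 7, "Maintainability": 1,
                   "Implementation Complexity": 1, "Cost to run": 1}
balanced_weights = {c: 1 for c in CRITERIA}

print(rank(scores, modeler_weights))
print(rank(scores, balanced_weights))
```

With these made-up numbers, the modeling-focused weights favor `approach_a`, while equal weights favor `approach_b`, which is why the weights themselves should be agreed on by the larger team rather than by a single group.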

# Communication of projects <a name="5"></a>

One of the **biggest challenges** that DS teams have in getting their implementations to be used by a company is to **communicate effectively**.

In most companies, the **non-technical person is unfamiliar with what a DS team does**. It’s understandable, as the nature of problems that can be tackled is so varied and the work that a DS team at a company might do is wildly different from another company. This means that project work for **ML teams needs to focus heavily on communications**. For a project to be successful, communicating to the business, the SMEs in the project group, and even other members of the Engineering organization needs to be a top priority, from the initial ideation sessions to a model’s maintenance period when it’s been deployed.

## Defining the problem <a name="5.1"></a>

The discussion of the **first meeting in the planning phase** has to revolve around the **final stage of what is wanted**, in highly abstract terms.

- **Technical people** tend to **focus on the "how"** of a project. How am I going to build this? How is the system going to integrate this data? How can I use different ML algorithms?

- The **Project Manager** will be focused **on the "when"** of the problem. When can we see a prototype of the solution?

- The **project owner** will be focused **on the "what"**. What lift is this going to give the business? What effect is this going to have on our customers? What is the risk if this goes poorly?

**None of these questions matter at this point**. The question that everyone should discuss is **"why are we building this?"**. This opens a frank conversation about **what needs and functionalities have to be built**, what is expected at the presentations and ideation sessions throughout the project, and **what design requirements and issues the business has**. It also gives a sense of focus to the larger team.

The ML team needs to understand what the business is expecting in a solution **by asking questions** about what they want, what the model should be doing, and what the customers would want to see.

By scoping and defining the problem, enormous **wastes of time and resources can be entirely prevented in the early stages of the project** (before a single line of code is written).

Let's see some of these **fundamental questions**.

#### What do you want to build? <a name="5.1.1"></a>

By focusing on the functionality of the project’s goal, the **product team** can be **involved in the discussion**, and the entire team can **plan for the nuances of the business** that need to be thought through.

**The ‘how’** is fun stuff, it’s complex, and it’s **fascinating to learn**, but only **for the ML team**. The rest of the team doesn’t care what modeling approaches are going to be used. It is better to **keep these details out of group discussions**.

The easiest way is through a **simulation of the final end-state** of the system: **what the project is aiming to do**. Using flow-path models to figure out **what the entire team expects** informs the ML team about the **details needed to limit the options** for "how" they’re going to build the solution. The ML team should save architecture diagrams and modeling discussions for internal discussions within the ML team. Breaking out a potential solution from the **perspective of a user** allows us to discuss the important aspects, and it also **opens the discussion to the non-technical team members** who will have insights to consider that will impact the experimentation and development phases.

A **flow-path diagram** must be simple and make it easy to see the **bare-bones functionality** of the system while hiding the complexity and implementation details. With it, the discussion can begin with **every person** in the room being **engaged and able to contribute to the ideas** of what will define the project’s initial state. Many of the ideas that are presented would likely not have been considered by the ML team had the product teams and SMEs not been a part of the discussion.

A **user experience journey** is a **simulation of a product**, exploring how a new feature or system will be consumed by a particular user. Showing how the system will interact with the data that we’re producing can aid in designing the ML solution to best meet the needs of the "customer". It can help find elements that may need to be treated as critical features during the development of the solution. It may reveal that the assumptions of different people are in conflict, or that the architecture needs to change. It’s easier to find this out now and account for the added complexity in the planning phase, before a model is built.

In initial **planning meetings, everyone wants to brainstorm** and work towards the "most amazing solution", but to get something into production, realism should creep in. A **focus** should be made **on the essential aspects** of the project to make it function correctly. **Ancillary ideas** should be **recorded**, modified, and referred to throughout the experimentation and development phases of the project. We don’t want to ignore ideas, but also **don’t allow every idea to make it into the core experimentation plan**. If the idea seems complex, we can revisit it later once the project is taking shape and the total project complexity is known to a deeper level.

The simple question **"how should this work?"** is arguably the most important question to ask to ensure that everyone involved in the project is on the same page.

#### What does the ideal end-state look like? <a name="5.1.2"></a>

The ideal implementation is hard to define at first (before any experimentation is done), but it’s useful to **hear all aspects of the ideal state**. **Creative discussions with SMEs** could lead to a unique and more powerful ML solution than what we may have come up with on our own. It **helps to**: 

- Shift our thinking into creative ways.
- Engage the person asking for the project to be built and allow their perspective, ideas, and creativity to influence the project in positive ways.
- Build trust and a feeling of ownership in the development of the project.

**Listening to the needs of a customer of our ML project** is an important skill, far more than mastering any algorithm, language, or platform. It will **help guide what we’re going to try and research**, and how to think differently about problems.

A sketch of the ideal state **will likely not be what the final system will be**, but it will **inform the direction of experimentation**, as well as the areas of the project that the team will need to research thoroughly to minimize or prevent unexpected scope creep.

The excitement and ideas from creative people are infectious and can create a truly amazing company that does a great job at its mission. However, **without tempering and focus, the size and complexity of a solution can quite rapidly spiral out of control**. The critical aspects of a project need to be discussed in an appropriate level of detail before a single character is typed in an experimentation notebook or IDE.

**Note**: Some of the ideas proposed may be unrealistic. We must thank the person for their idea, and gently explain in non-technical terms how it’s impossible at this time and move on to finish the project.

#### When should we meet to share the progress? <a name="5.1.3"></a>

Due to the complex nature of most ML projects, meetings are critical, but **not all meetings are created equal**. While it's tempting to have "cadence meetings", project **meetings should coincide with milestones associated with the project**. These meetings should:

- Not be a substitute for daily standup meetings
- Not overlap with team-focused meetings of individual departments
- Always be conducted with the full team present, and the project lead present to make final decisions
- Be focused on presenting the solution as it stands at that point

At the early meetings, the ML team should **communicate to the group the need for these event-based meetings**; to let everyone know that changes that might seem insignificant to other teams could have rework risks associated with them.

**People who are not involved in the project** can be curious and **would like to provide feedback** but it **introduces chaos** to the project that is difficult to manage.

The next figure illustrates a high-level **Gantt chart** of the milestones associated with a general e-commerce ML project, focusing on the main concepts. It serves as a communication tool that can improve the productivity of the teams and reduce a bit of the chaos in a multi-disciplinary team.

![](https://i.ibb.co/Fx1tQmc/project-meeting-timeline.png)

The milestone arrows along the top of this chart **show** the **critical stages where the entire team should be meeting** together to ensure that all members of the team understand the implications of what has been developed and discovered. Meeting at these points helps to minimize the amount of time wasted on rework and makes sure that the project is on track to do what it set out to do.

While it isn’t necessary to create a Gantt chart for every project, it is advisable to create at least something to track progress and milestones against.
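As a lightweight alternative to a full Gantt chart, the milestones can be tracked in a few lines of code. The sketch below is only one possible shape for such a tracker: the `Milestone` class, the helper names, and the dates are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Milestone:
    name: str           # label of the full-team review (illustrative)
    due: date           # planned date of the meeting (illustrative)
    done: bool = False  # set to True once the review has been held


def next_milestone(milestones):
    """Return the earliest milestone that has not been completed, or None."""
    pending = [m for m in milestones if not m.done]
    return min(pending, key=lambda m: m.due, default=None)


def overdue(milestones, today):
    """Return the names of unfinished milestones whose due date has passed."""
    return [m.name for m in milestones if not m.done and m.due < today]


# Hypothetical plan following the review stages discussed in this notebook.
plan = [
    Milestone("Post-research review", date(2024, 1, 15), done=True),
    Milestone("Post-experimentation review", date(2024, 2, 5)),
    Milestone("MVP review", date(2024, 4, 1)),
    Milestone("Pre-production review", date(2024, 4, 22)),
]
today = date(2024, 2, 10)

print(next_milestone(plan).name)  # Post-experimentation review
print(overdue(plan, today))       # ['Post-experimentation review']
```

Even a list like this gives the team a shared reference for which review comes next and which ones have slipped.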

A Gantt chart may not make sense for solo outings where a single ML Engineer is handling the entire project. Even in this situation, figuring out where the major boundaries within the project’s development are and scheduling a bit of a show-and-tell can be extremely helpful.

## Critical discussions <a name="5.2"></a>

**“Where do we set these boundaries for a project?”** While **each project is unique** with respect to the amount of work required to solve the problem, the number of people involved, and the technological risks surrounding the implementation, there are a few **general guidelines of stages** that are helpful to set as **"minimum required meetings"**. The result is a schedule that sets out when we will all be meeting, what we will be talking about, what to expect from those meetings, and how active participation from everyone involved in the project will help to minimize risk to the timely delivery of the solution.

A regularly scheduled program (for instance, every Wednesday at 1 pm) does not work. The more efficient way is to meet to **review the solution-in-progress only when there is something new to review**. But how do we define where these boundaries are, so that the need to discuss elements of the project is balanced against the exhaustion that comes from reviewing minor changes in too-frequent, work-disrupting meetings? The fewer and more focused the meetings, the better.

#### Post research phase discussion <a name="5.2.1"></a>


This meeting is the second-to-last chance for the team to raise a white flag if they’ve discovered that the project is untenable or will end up taking more time or money than the team has allocated. The question that should dominate is: **"Can we actually figure this out?"**. The meeting **should not focus on the implementation details, but on the results of the research phase**.

The ML team needs to show the SMEs, in a **language that the audience will comprehend**, the status of what they’ve discovered, **what the options are, what can be done** with each of the solutions, what is impossible, and **when they can expect to see a prototype** of each so that they can judge which one they like better.

- How is the progress towards the prototype coming along?
    - Have you figured out any of the things that you’re testing yet?
    - Which one looks like it’s the most promising so far?
    - Are you going to stop pursuing anything that you had planned to test?
    - Are we on track to having a prototype by the scheduled due date?
- What risks have you uncovered so far?
    - Are there challenges with the data that the Data Engineering team needs to be made aware of?
    - Are we going to need a new technology, platform, or tool that the team isn’t familiar with?
    - As of right now, do you feel as though this is a solvable problem for us?

These questions are all designed to evaluate whether or not this project is tenable from a personnel, technology, platform, and cost perspective.

Don't bring up all of the options explored or something that has amazing results but will likely take 2 years to build. Instead, distill the discussion to the **core details** that are required to get the next phase going: experimentation. The **experimental testing phase may test out a dozen ideas, but only present the two most acceptable to a business for review**. If the implementation would be costly or complex, it’s best to present to the business only the options that offer the greatest chance of project success, even if they’re not as fancy as other solutions (the ML team has to maintain the solution).

**Note**: When developing an ML solution for a company, the question of whether a problem is **"possible to solve"** is not whether it can be solved at all, but whether **it’s possible to create a solution in a short enough period of time so as not to waste too much money and resources**. The reluctance that creators usually have to abandon their creations grows as the time and energy spent on that creation increases. If we can call a halt to a project early enough (by recognizing the signs that it is not worth pursuing), we will be able to move on to something more worthwhile.

#### Post experimentation phase <a name="5.2.2"></a>

During the phase of experimentation, the ML team builds several prototypes. In the previous meeting, the pros and cons of these prototypes were presented and critical issues were listed out. Now it’s **time to see what a prototype of the solution looks like**, by **showing a mock-up of the core features**. The **full implementation should not be done by this point**, but merely **simulated** to show what the eventual system would look like. The same questions should be asked as in the preceding meeting, except tailored to the estimations of capability for developing the full solution. During these demonstrations, it's helpful to use real data. If we’re showing a demo of inferences to a group of SME members, then we show their data if we can. We record each positive, but more importantly, each negative impression that they give.

The **results of this meeting** should be **similar to** those from the **initial planning meeting**. **Additional features** that weren’t recognized as being important **can be added** to the development planning and if any of the original features are found to be unnecessary, they should be removed from the plan.

With the experimentation phase out of the way, the ML team can **explain that the "nice to have" elements** from earlier phases are not only doable but **can be integrated without a great deal of extra work**. Some features can also remain as "nice to have". If it is found during development that integrating such a feature would be attainable, then it is left as part of this living planning document.

The most important part of this stage’s meeting is that **everyone on the team is aware of the elements and moving pieces that are involved in the project at this stage**. It ensures that the team understands what elements need to be scoped, what the general epics and stories are that need to be created for sprint planning, and that a collaborative estimation of the implementation is arrived at. At the end of the demonstration, the entire team should have the ability to gauge whether the project is worth pursuing. There is no need, however, to discuss technical details of implementation during this meeting.

Depending on the project, **a "terrible" prototype can be improved** (tuning the model, augmenting the feature set, etc.), **or can be impossible to improve** (data doesn’t exist to augment, the technology to solve that problem doesn’t exist yet, etc.). It’s important to quickly distill the reasons why the issues that are identified are happening. If the causes can be addressed by the ML team, then simply say so. But if the problem is intensely complex, the response should be either thoughtfully articulated to the person, or recorded for a period of additional research that is capped in time and effort. We need to make sure that we know what is and what is not possible before coming into a prototype review session.

#### Development sprint reviews <a name="5.2.3"></a>

During these meetings, the teams should focus on **milestones of the current state of the features being developed**. It is useful to use the same approach that was used in the experimentation review phase, **using the same prototype data so that a direct comparison with earlier stages** can be seen. It helps to have a common frame of reference for the SME group to gauge the subjective quality of the solution.

**Simulations of the solution should be shown** to ensure that the business team and **SMEs can provide relevant feedback** on details that might have been overlooked by the engineers working on the project. They should be discussed in abstract terms (without specific details of software development or model tuning). It’s more effective to say, "We’re in the model training phase this week" rather than, "We’re attempting to optimize for RMSE through cross-validation of hyperparameters".

If, at a previous meeting, the quality of the predictions was determined to be lacking in some way, an update and a demonstration of the fix should be shown to ensure that the problems were actually solved (show the SME group the fix using the same data with which they originally identified the problem).
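Before such a demonstration, the ML team can check internally that the fix really is an improvement on the same frozen data. The sketch below assumes a numeric prediction task scored with RMSE; the function names and the toy values are hypothetical, and a real project would substitute its own metric and the data the SME group originally flagged.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def compare_on_frozen_set(actual, old_preds, new_preds):
    """Score the old and new model outputs on the same reference data."""
    old_score = rmse(actual, old_preds)
    new_score = rmse(actual, new_preds)
    return {
        "old_rmse": old_score,
        "new_rmse": new_score,
        "improved": new_score < old_score,  # lower RMSE is better
    }


# Toy values standing in for the data shown at the earlier review.
actual = [3.0, 5.0, 2.0, 4.0]
old_preds = [1.0, 5.0, 4.0, 4.0]  # predictions shown at the previous meeting
new_preds = [3.0, 4.0, 2.0, 5.0]  # predictions after the fix

result = compare_on_frozen_set(actual, old_preds, new_preds)
print(result["improved"])  # True
```

The numbers stay inside the ML team; what the SME group sees is the same examples as before, now with the corrected output.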

As the project moves further, these meetings should become shorter in duration and more focused on the integration aspects of the project. **At the final meeting, the SME group should be looking at an actual demonstration**. 

As the complexity grows, it can be helpful to push out builds of the QA version of the project to the SME team members so that they can evaluate the solution on their own time, bringing their feedback to the team at a regularly scheduled cadence meeting.

**Unforeseen changes**: ML projects are "complex". Aside from the modeling, the interrelated rules, conditions, and usages of the predictions can be complex, and **things will be missed or overlooked even in the most thorough planning phases**. Perhaps the data doesn’t exist or is too costly to create to solve a particular problem in the framework of what has been built. With a few changes in approach, the solution can be realized, but it will be at the expense of an increase in complexity or cost for another aspect of the solution. This is part of ML. **When a blocker arises, it should be communicated** clearly to everyone that needs to know about the change. Don’t silently hack a solution that seems like it will work and not mention it to anyone.

#### MVP review <a name="5.2.4"></a>

At this point, the internal engineering reviews have been done, the system is working correctly, the integration tests are passing, latencies have been tested at a large burst-traffic scale, etc.

The MVP review is the stage at which **subjective measures of quality are taken**. Metrics are good for ML implementations (and should be used and recorded for all modeling), but the **best gauge of whether the solution is qualitatively solving the problem is the knowledge of internal users and experts who can use the system before it’s deployed to end-users**. An effective technique that can be employed is a survey. It standardizes the analysis of the responses, giving a broad estimation of what additional elements need to be added or modified. We should embrace the feedback, particularly if it is negative, and come up with ideas collectively to address these concerns.
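To illustrate how a survey standardizes the analysis, the sketch below averages scores per question and flags the areas that fall below a chosen threshold. The question names, the 1 to 5 scale, the threshold, and the responses are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical responses: (reviewer, question, score on a 1-5 scale).
responses = [
    ("sme_1", "relevance", 4),
    ("sme_2", "relevance", 2),
    ("sme_3", "relevance", 3),
    ("sme_1", "diversity", 2),
    ("sme_2", "diversity", 1),
    ("sme_3", "diversity", 2),
]


def summarize(responses, threshold=3.0):
    """Average the scores per question and list questions below the threshold."""
    scores_by_question = defaultdict(list)
    for _reviewer, question, score in responses:
        scores_by_question[question].append(score)
    summary = {q: mean(s) for q, s in scores_by_question.items()}
    needs_work = sorted(q for q, avg in summary.items() if avg < threshold)
    return summary, needs_work


summary, needs_work = summarize(responses)
print(needs_work)  # ['diversity']
```

A low average points the team at what to discuss; free-text comments collected alongside the scores explain why.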

This meeting may uncover glaring issues that were missed in the planning, experimental, and development phases. The critical aspect of this evaluation is to ensure that the members evaluating the solution are not in any way vested in the creation of it, nor aware of its inner workings. Nearly every project is accompanied by a large **creator bias**. The creators can overlook and **miss important flaws** in it due to familiarity and adoration of it. **If at the end of one of these review meetings, the only responses are overwhelmingly positive praise of the solution, it should raise some concerns with the team**. Emotional bias for the project may cloud the judgment of its efficacy. If this is noticed, we must pull in others at the company who have no stake in the project.

If we ever find ourselves having made it through this evaluation without a single issue being found, either we’re the luckiest team ever, or the evaluators are completely checked out. This is common in smaller companies where nearly everyone is aware of and invested in a solution. It can be helpful to bring in outsiders in this case to validate the solution. Many companies engage in alpha or beta testing for this purpose: to elicit high-quality feedback from customers that are invested in their products and platforms.

Successful releases involve a stage after the engineering QA phase is complete in which the solution undergoes [user acceptance testing](https://en.wikipedia.org/wiki/Acceptance_testing) (UAT).

#### Pre-production review <a name="5.2.5"></a>

At this point, final modifications and features have been added, feedback has been addressed, QA checks have passed, and the system has run without blowing up for several days. From an engineering standpoint, everything has passed all tests. The last thing to do is to "ship it" to production.

This final meeting before release should be structured as a project-based retrospective and **analysis of features**. Everyone should be asking: "Did we build what we set out to build?". A **comparison between what was originally planned, what features were rejected as out of scope, and what has been added should be reviewed**. This can help inform expectations of the analytics data that should be queried upon release.

Metrics are already set up for collecting performance, and analytics **datasets have been created, ready to be populated for measuring the success of the project**. This meeting should also focus on **where those datasets are going to be located**, how engineers can query them, and what charts and reports are going to be available (and how to access them) for the non-technical members of the team. Preparation work to ensure that **people can have "self-service" access to the metrics and statistics for the project** will ensure that critical data-based decisions can be made by everyone in the company, even those not involved in the creation of the solution.

Hopefully, all scenarios have been tested long before this point, but it’s important to engage with the entire team to ensure that the functionality is conclusively confirmed to be implemented correctly. After this point, there’s no going back; once it’s released to production, it’s in the hands of the customer. Think of the damage to the reputation of the project if something that is broken is released. 

**A note on patience**: There are **many factors** that can **affect the perceived success or failure of a project**, **some** of them **within the control of the design team, others** completely out of control and **unknown**. Judgment on the efficacy of the design needs to be withheld until a **sufficient quantity of data is collected about the performance** of the solution in order to make a statistically valid adjudication.
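As a rough illustration of what a "sufficient quantity of data" can mean, the sketch below uses the standard normal-approximation formula for comparing two proportions to estimate how many users are needed in each group before a change in a rate (for example, a conversion rate) can be judged. The baseline and expected rates, the significance level, the power, and the function name are assumptions chosen for illustration, not values from this project.

```python
import math
from statistics import NormalDist


def samples_per_group(p_baseline, p_expected, alpha=0.05, power=0.8):
    """Approximate sample size per group for a two-sided two-proportion test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    variance = p_baseline * (1 - p_baseline) + p_expected * (1 - p_expected)
    effect = p_expected - p_baseline
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Assumed scenario: baseline rate of 10%, hoping to detect a lift to 12%.
n_small_lift = samples_per_group(0.10, 0.12)
# A larger expected lift (10% -> 15%) needs far fewer observations.
n_large_lift = samples_per_group(0.10, 0.15)

print(n_small_lift, n_large_lift)
```

The smaller the expected effect, the longer the team has to wait before declaring success or failure, which is why the patience matters.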

## How to talk about results <a name="5.3"></a>

**Explaining how ML algorithms work to a non-technical person is challenging**. Analogies, experiment-based examples, and comprehensible diagrams to accompany them are difficult at the best of times (when someone is asking for the sake of curiosity). When the **questions are posed by members of a cross-functional team that are trying to get a project released**, it can be even more challenging and mentally taxing (since they have a basis for expectation regarding what they want the solution to do).

In any project’s development, **these questions will invariably come up**. The questions below are specific to the example recommendation engine that we’ve been discussing.

- "Why does it think that I would like that? I would never pick something like that for myself!"
- "Why is it recommending umbrellas? That customer lives in the desert. What is it thinking?!"
- "Why does it think that this customer would like t-shirts? They only buy haute couture."

The **answer** to these questions is **"We don’t have the data to answer that question"**. It’s best to exhaust all possibilities of feature engineering creativity before claiming that, but assuming that you have, it’s the only actual answer that is worth giving. The **figure** below shows a helpful **visualization for explaining the concept of what ML can, and what it cannot do**.

![](https://i.ibb.co/WnGs1KC/data-in-ML.png)

Perhaps the data is of such a personal nature that there is simply no way to infer this or collect this information. Perhaps the data to be collected would be so complex, expensive to store, or challenging to collect that it’s simply not within the budget of the company to do so.

When an SME asks, "Why didn’t this person add these items to their cart if the model predicted that these were so relevant for them?", there is no way we can answer that. It’s understandable for that SME to want to know why the model behaved a certain way and what outcome to expect. **Instead of dismissing the question, we can simply posit a few questions while explaining the view of reality that the model can "see"**. Perhaps the user was shopping for someone else? Perhaps they were looking for something new, inspired by an event that we can’t see in our data. Perhaps they simply weren’t in the mood. There are a lot of latent factors that can influence the behavior of events in the "real world".

Explaining limitations in this way can help to **dispel assumed unrealistic capabilities of ML to non-technical people**, and eliminate disappointment and frustration. As so many people have said throughout the history of business, **"it’s always best to under-promise and over-deliver"**.

We will always have more success **explaining concepts in familiar terms** to our audience and approaching complex topics **through examples** rather than defaulting to exclusionary dialogue that will not be fully understood by others on the team who are not familiar with the inner workings of our profession.

# References <a name="6"></a>

- [Machine Learning Engineering in Action](https://livebook.manning.com/book/machine-learning-engineering/)