<a href="https://colab.research.google.com/github/glimmer-jm/Projects/blob/main/Data_Science_and_MLOps_Landscape_in_Industry.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction: Navigating the Data Science and MLOps Landscape in Industry

As a smart supply chain engineer, I’ve seen firsthand how logistics, inventory, and demand forecasting can make or break an organization’s success. My passion for machine learning (ML) and deep learning (DL) stems from their potential to optimize these systems, turning traditional supply chains into agile, data-driven networks that anticipate disruptions and maximize efficiency. The promise of artificial intelligence (AI) isn’t just theoretical: according to the [McKinsey Global Survey](https://www.mckinsey.com/capabilities/quantumblack/our-insights/global-survey-the-state-of-ai-in-2021) on the State of AI in 2021, a majority of organizations reported adopting AI capabilities by that year, with a growing impact on the bottom line. From cost reductions to revenue gains, AI is proving its worth across industries. Yet, as the survey notes, operationalizing these technologies at scale remains a formidable challenge.

This tension between potential and practice fuels my curiosity about the broader Data Science and MLOps landscape. The McKinsey findings show that 56% of respondents had adopted AI in at least one function by 2021, up from 50% in 2020, but my experience in supply chain optimization tells me that building a model is only half the battle. Deploying it into production, where it must adapt to real-world complexities like fluctuating demand or supplier delays, requires a robust framework. Enter Machine Learning Operations (MLOps), a discipline that bridges experimentation and implementation, ensuring models don’t just sit on a laptop but drive tangible outcomes. Inspired by this, I turn to the [2022 Kaggle Machine Learning & Data Science Survey](https://www.kaggle.com/competitions/kaggle-survey-2022/overview/description), a treasure trove of 23,997 responses collected from September 16 to October 16, 2022, to explore how industry professionals like me are navigating this shift.

In this project, I aim to uncover the state of AI adoption and MLOps practices among industry practitioners, with a lens sharpened by my supply chain perspective. The Kaggle dataset offers a granular view of the tools, workflows, and roles shaping this field, from cloud platforms powering DL models to the skills defining AI careers. How widespread is ML deployment in enterprises today, and does it mirror the upward trend McKinsey observed? Are supply chain-heavy sectors like manufacturing or retail leading in MLOps maturity? What tech stack dominates, and are deep learning methods, which are crucial for tasks like predictive maintenance or demand forecasting, truly scalable in business settings? By focusing on employed professionals (not students) who specify their industry, I’ll extract insights from roughly 37.9% of the survey’s responses, painting a picture of a field on the cusp of revolution. Through data exploration and storytelling, this analysis will reflect both my enthusiasm for ML and DL and my commitment to understanding how they reshape industries, one supply chain at a time.
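
The filter on employed professionals who report an industry can be sketched as below. This is a minimal illustration on a toy frame, not the notebook's actual filtering cell. It assumes that `Q5` (the student question) is answered with `"Yes"`/`"No"` and that the industry question lives in a column named `Q24`; both the answer values and the `Q24` column name are assumptions to check against the question titles, and `filter_professionals` is a hypothetical helper.

```python
import pandas as pd

# Toy stand-in for the survey responses (invented rows, not real data).
# Assumptions: Q5 is the student flag with "Yes"/"No" answers,
# and Q24 holds the respondent's industry (NaN when not answered).
toy = pd.DataFrame({
    "Q5": ["No", "Yes", "No", "No", "Yes"],
    "Q24": ["Manufacturing", None, "Retail/Sales", None, None],
})

def filter_professionals(responses, student_col="Q5", industry_col="Q24"):
    """Keep non-student respondents who reported an industry."""
    not_student = responses[student_col].eq("No")
    has_industry = responses[industry_col].notna()
    return responses[not_student & has_industry]

professionals = filter_professionals(toy)
share = len(professionals) / len(toy)
print(f"{len(professionals)} of {len(toy)} rows kept ({share:.1%})")
```

Applied to the real responses, the same two boolean masks would give the retained share directly, which is one way to sanity-check the 37.9% figure quoted above.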

In [1]:
# Data handling
import pandas as pd
import numpy as np
import json
from collections import Counter

# Static plotting
import matplotlib.pyplot as plt
import seaborn as sns

# Interactive plotting
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import plotly.figure_factory as ff
import plotly.express as px
from plotly.offline import init_notebook_mode, iplot
from plotly.colors import n_colors

# Notebook display helpers
import IPython.display
from IPython.display import display, HTML, Javascript, clear_output, Image
import ipywidgets as widgets

# Silence library warnings to keep the notebook output readable
import warnings
warnings.filterwarnings('ignore')

In [4]:
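# Load the data
df = pd.read_csv('/content/drive/MyDrive/Projects Colab/kaggle_survey_2022_responses.csv')
# The first data row holds the full question text; keep it separately
questions_titles = df[0:1]
# The remaining rows are the actual survey responses
df = df[1:]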

In [5]:
# Verify the load
print('Question Titles (First Row):')
print(questions_titles.iloc[0])
print("\nDataset Shape:")
print(df.shape)

Question Titles (First Row):
Duration (in seconds)                                Duration (in seconds)
Q2                                             What is your age (# years)?
Q3                                  What is your gender? - Selected Choice
Q4                               In which country do you currently reside?
Q5                       Are you currently a student? (high school, uni...
                                               ...                        
Q44_8                    Who/what are your favorite media sources that ...
Q44_9                    Who/what are your favorite media sources that ...
Q44_10                   Who/what are your favorite media sources that ...
Q44_11                   Who/what are your favorite media sources that ...
Q44_12                   Who/what are your favorite media sources that ...
Name: 0, Length: 296, dtype: object

Dataset Shape:
(23997, 296)
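
Because the printed titles are truncated and multi-select questions such as `Q44` are spread over several columns (`Q44_1` … `Q44_12`), it can help to build a lookup from column code to full question text and to group the columns that belong to one question. The sketch below uses a toy frame with invented placeholder text. The helper `columns_for` is hypothetical, and the idea that each multi-select column holds the option label when selected and NaN otherwise is an assumption about the file layout.

```python
import pandas as pd

# Toy stand-in for the raw CSV: row 0 carries the question text, as in the real file.
# The Q44 strings are invented placeholders, not the survey's actual wording.
raw = pd.DataFrame({
    "Q2": ["What is your age (# years)?", "25-29", "30-34"],
    "Q44_1": ["Placeholder question text - option A", "option A", None],
    "Q44_2": ["Placeholder question text - option B", None, "option B"],
})

questions_titles = raw[0:1]
answers = raw[1:]

# Map each column code to its full, untruncated question text.
question_text = questions_titles.iloc[0].to_dict()

def columns_for(question, columns):
    """Return the columns of one question, e.g. 'Q44' -> ['Q44_1', 'Q44_2']."""
    return [c for c in columns if c == question or c.startswith(question + "_")]

q44_cols = columns_for("Q44", answers.columns)

# Assumption: a cell is non-null only when the respondent picked that option,
# so counting non-null cells gives the number of selections per option.
q44_counts = answers[q44_cols].notna().sum()

print(question_text["Q2"])
print(q44_counts)
```

With the real data, `question_text` could be built from `questions_titles.iloc[0]` in the same way, so any column code can be translated back to its question without widening the pandas display settings.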
