author | title | semester | footer | license |
---|---|---|---|---|
Claire Le Goues & Christian Kaestner |
MLiP: Motivation, Syllabus, and Introductions |
Spring 2024 |
Machine Learning in Production/AI Engineering • Claire Le Goues & Christian Kaestner, Carnegie Mellon University • Spring 2024 |
Creative Commons Attribution 4.0 International (CC BY 4.0) |
We use Slack for this course, including during lectures
See signup link on Canvas
Setup the ability to read/post to Slack during lecture
¯\_(ツ)_/¯
About 120 students waitlisted
Best guess: 40 more people will get in, but it may take a few days
For those joining late:
- Ask us for recording of missed lectures on Slack
- Post introduction on Slack (
#intro
) when joining - See Canvas for automatic extensions and makeup opportunities for quizzes, labs, and homeworks
- Automatically excused for participation in missed lectures
- Understand how ML components are parts of larger systems
- Illustrate the challenges in engineering an ML-enabled system beyond accuracy
- Explain the role of specifications and their lack in machine learning and the relationship to deductive and inductive reasoning
- Summarize the respective goals and challenges of software engineers vs data scientists
- Explain the concept and relevance of "T-shaped people"
- Preliminaries (just done)
- Case Study
- Syllabus
- Introductions
Take audio or video files and produce text.
- Used by academics to analyze interview text
- Podcast show notes
- Subtitles for videos
State of the art a few years ago: Manual transcription, often mechanical turk (1.5 $/min)
Recently: Many ML models for transcription (e.g., in Youtube, Alexa, Siri, Zoom)
PhD research on domain-specific speech recognition, that can detect technical jargon
DNN trained on public PBS interviews + transfer learning on smaller manually annotated domain-specific corpus
Research has shown amazing accuracy for talks in medicine, poverty and inequality research, and talks at Ruby programming conferences; published at top conferences
Idea: Let's commercialize the software and sell to academics and conference organizers
As a group, think about challenges that the team will likely focus when turning their research into a product:
- One machine-learning challenge
- One engineering challenge in building the product
- One challenge from operating and updating the product
- One team or management challenge
- One business challenge
- One safety or ethics challenge
Post answer to #lecture
on Slack and tag all group members (skip if nobody in group has slack set up yet)
Notes: Highlights challenging fragments. Can see what users fix inplace to correct. Star rating for feedback.
<style> text { font: 60px sans-serif; } </style> Data Scientists Software Engineers
- Often fixed dataset for training and evaluation (e.g., PBS interviews)
- Focused on accuracy
- Prototyping, often Jupyter notebooks or similar
- Expert in modeling techniques and feature engineering
- Model size, updateability, implementation stability typically does not matter
- Builds a product
- Concerned about cost, performance, stability, release time
- Identify quality through customer satisfaction
- Must scale solution, handle large amounts of data
- Detect and handle mistakes, preferably automatically
- Maintain, evolve, and extend the product over long periods
- Consider requirements for security, safety, fairness
By Steven Geringer, via Ryan Orban. Bridging the Gap Between Data Science & Engineer: Building High-Performance Teams. 2016
Broad-range generalist + Deep expertise
Figure: Jason Yip. Why T-shaped people?. 2018
Broad-range generalist + Deep expertise
Example:
- Basic skills of software engineering, business, distributed computing, and communication
- Deep skills in deep neural networks (technique) and medical systems (domain)
- What does correctness or accuracy really mean? What accuracy do customers care about?
- How can we see how well we are doing in practice? How much feedback are customers going to give us before they leave?
- Can we estimate how good our transcriptions are? How are we doing for different customers or different topics?
- How to present results to the customers (including confidence)?
- When customers complain about poor transcriptions, how to prioritize and what to do?
- What are unacceptable mistakes and how can they be avoided? Is there a safety risk?
- Can we cope with an influx of customers?
- Will transcribing the same audio twice produce the same result? Does it matter?
- How can we debug and fix problems? How quickly?
- With more customers, transcriptions are taking longer and longer -- what can we do?
- Transcriptions sometimes crash. What to do?
- How do we achieve high availability?
- How can we see that everything is going fine and page somebody if it is not?
- We improve our entity detection model but somehow system behavior degrades... Why?
- Tensorflow update; does our infrastructure still work?
- Once somewhat successful, how to handle large amounts of data per day?
- Buy more machines or move to the cloud?
- Models are continuously improved. When to deploy? Can we roll back?
- Can we offer live transcription as an app? As a web service?
- Can we get better the longer a person talks? Should we then go back and reanalyze the beginning? Will this benefit the next upload as well?
- How many domains can be supported? Do we have the server capacity?
- How specific should domains be? Medical vs "International Conference on Allergy & Immunology"?
- How to make it easy to support new domains?
- Can we handle accents?
- Better recognition of male than female speakers?
- Can and should we learn from customer data?
- How can we debug problems on audio files we are not allowed to see?
- Any chance we might private leak customer data?
- Can competitors or bad actors attack our system?
17-445/17-645/17-745/11-695, Spring 2024, 12 units
Monday/Wednesdays 2-3:20pm
Recitation Fridays 9:30am, 11am, and 2pm
- Email us or ping us on Slack (invite link on Canvas)
- All announcements through Slack
#announcements
- Weekly office hours, starting next week, schedule on Canvas
- Post questions on Slack
- Please use
#general
or#assignments
and post publicly if possible; your classmates will benefit from your Q&A!
- Please use
- All course materials (slides, assignments, old midterms) available on GitHub and course website: https://mlip-cmu.github.io/s2024/
- Pull requests encouraged!
Focused on engineering judgment
Arguments, tradeoffs, and justification, rather than single correct answer
Practical engagement, building systems, testing, automation
Strong teamwork component
Both text-based and code-based homework assignments
Some machine-learning experience required
- Basic understanding of data science process, incl. data cleaning, feature engineering, using ML libraries
- High level understand of machine-learning approaches
- supervised learning
- regression, decision trees, neural networks
- accuracy, recall, precision, ROC curve
- Ideally, some experience with notebooks, sklearn or other frameworks
Basic programming and command-line skills will be needed
No further software-engineering knowledge required
- Teamwork experience in product team is useful but not required
- No required exposure to requirements, software testing, software design, continuous integration, containers, process management, etc
- If you are familiar with these, there will be some redundancy -- sorry!
"Coding warmup assignment"
Out now, due Monday Jan 29
Enhance simple web application with ML-based features: Image search and automated captioning
Open ended coding assignment, change existing code, learn new APIs
Case study driven
Discussions highly encouraged
Regular in-class activities, breakouts
Contribute your own experience!
Discussions over definitions
Try to attend lecture -- discussions are important to learning
Participation is part of your grade
No lecture recordings, textbook and slides available
Contact us for accommodations (illness, interview travel, unforseen events) or have your advisor reach out. We try to be flexible
Participation != Attendance
Grading:
- 100%: Participates actively at least once in most lectures by (1) asking or responding to questions or (2) contributing to breakout discussions
- 90%: Participates actively at least once in two thirds of the lectures
- 75%: Participates actively at least once in over half of the lectures
- 50%: Participates actively at least once in one quarter of the lectures
- 20%: Participates actively at least once in at least 3 lectures.
Building Intelligent Systems by Geoff Hulten
https://www.buildingintelligentsystems.com/
Most chapters assigned at some point in the semester
Supplemented with research articles, blog posts, videos, podcasts, ...
Electronic version in the library
Short essay questions on readings, due before start of lecture (Canvas quiz)
Planned for: about 30-45 min for reading, 15 min for discussing and answering quiz
"Machine Learning in Production: From Models to Products"
Mostly similar coverage to lecture
Not required, use as supplementary reading
Published online (and in book form next year)
Most assignments available on GitHub now
Series of 4 small to medium-sized individual assignments:
- Engage with practical challenges
- Analyze risks, fairness
- Reason about tradeoffs and justify your decisions
- Mostly written reports, a little modeling, some coding
Large team project with 4 milestones:
- Build and deploy a prediction (movie recommendation) service
- Testing in production, monitoring
- Final presentation
Usually due Monday night; see schedule
Research project instead of individual assignments I3 and I4
Design your own research project and write a report
- A case study, empiricial study, literature survey, etc.,
Very open ended: Align with own research interests and existing projects
See the project requirements and talk to us
First hard milestone: initial description due Feb 27
Introducing various tools, e.g., fastAPI (serving), Kafka (stream processing), Jenkins (continuous integration), MLflow (experiment tracking), Docker & Kubernetis (containers), Prometheus & Grafana (monitoring), CHAP (explainability)...
Hands on exercises, bring a laptop
Often introducing tools useful for assignments
about 1h of work, graded pass/fail, low stakes, show work to TA
First lab on this Friday: Calling, securing, and creating APIs
We recommend to start at lab before the recitation, but can be completed during
Graded pass/fail by TA on the spot, can retry
Relaxed collaboration policy: Can work with others before and during recitation, but have to present/explain solution to TA individually
(Think of recitations as mandatory office hours)
- 35% individual assignment
- 30% group project with final presentation
- 10% midterm
- 10% participation
- 10% reading quizzes
- 5% labs
- No final exam (final presentations will take place in that timeslot)
Expected grade cutoffs in syllabus (>82% B, >94 A-, >96% A, >99% A+)
Specification grading, based in adult learning theory
Giving you choices in what to work on or how to prioritize your work
We are making every effort to be clear about expectations (specifications), will clarify if you have questions
Assignments broken down into expectations with point values, each graded pass/fail
Opportunities to resubmit work until last day of class
8 individual tokens per student:
- Submit individual assignment 1 day late for 1 token (after running out of tokens 15% penalty per late day)
- Redo individual assignment for 3 token
- Resubmit or submit reading quiz late for 1 token
- Redo or complete a lab late for 1 token (show in office hours)
- Remaining tokens count toward participation
8 team tokens per team:
- Submit milestone 1 day late for 1 token (no late submissions accepted when out of tokens)
- Redo milestone for 3 token
- No need to tell us if you plan to submit very late. We will assign 0 and you can resubmit
- Instructions and Google form for resubmission on Canvas
- We will automatically use remaining tokens toward participation and quizzes at the end
- Remaining individual tokens reflected on Canvas, for remaining team tokens ask your team mentor.
Instructor-assigned teams
Teams stay together for project throughout semester, starting Feb 5
Fill out Catme Team survey before Feb 5 (3pt)
Some advice in lectures; we'll help with debugging team issues
TA assigned to each team as mentor; mandatory debriefing with mentor and peer grading on all milestones (based on citizenship on team)
Bonus points for social interaction in project teams
See web page
In a nutshell: do not copy from other students, do not lie, do not share or publicly release your solutions
In group work, be honest about contributions of team members, do not cover for others
Collaboration okay on labs, but not quizzes, individual assignments, or exams
If you feel overwhelmed or stressed, please come and talk to us (see syllabus for other support opportunities)
GPT4, ChatGPT, CoPilot...? Reading quizzes, homework submissions, ...?
This is a course on responsible building of ML products. This includes questions of how to build generative AI tools responsibly and discussing what use is ethical.
Feel free to use them and explore whether they are useful. Welcome to share insights/feedback.
Warning: Be aware of hallucinations. Requires understanding to check answers. We test them ourselves and they often generate bad/wrong answers for reading quizzes.
You are responsible for the correctness of what you submit!
Note: Source: https://www.aiweirdness.com/do-neural-nets-dream-of-electric-18-03-02/
/**
Return the text spoken within the audio file
????
*/
String transcribe(File audioFile);
We routinely build:
- Safe software with unreliable components
- Cyberphysical systems
- Non-ML big data systems, cloud systems
- "Good enough" and "fit for purpose" not "correct"
ML intensifies our challenges
Before the next lecture, introduce yourself in Slack channel #social
:
- Your (preferred) name
- In 1~2 sentences, your data science background and goals (e.g., coursework, internships, work experience)
- In 1~2 sentences, your software engineering background, if any, and goals (e.g., coursework, internships, work experience)
- One topic you are particularly interested in learning during this course?
- A hobby or a favorite activity outside school
Machine learning components are part of larger systems
Data scientists and software engineers have different goals and focuses
- Building systems requires both
- Various qualities are relevant, beyond just accuracy
Machine learning brings new challenges and intensifies old ones