I have a Substack :)
https://mikexcohen.substack.com/
My Substack posts are each 1000-3000 words (5-15 minutes to read) and focus on machine learning, LLMs, applied math, and related technical topics.
Each post has an accompanying code file that will reproduce and extend the analyses described in the post. I wrote the code files in Google Colab, so running them in Colab is the easiest way to reproduce the results without library-installation issues.
The technical posts are organized into two sections, one about data science and one about large language model mechanisms.
Explore core concepts in data science and applied math through clear explanations, equations, and hands-on code. Each post unpacks a single topic, ranging from correlation and covariance to Fourier transforms, fractals, and neural simulations, translating between theory and Python implementation. Every post comes with a Python notebook so you can reproduce the results, experiment with the methods, and apply them to your own projects.
(Hint: Press ctrl or command while clicking the links to open in a new tab.)
Post title | Code file | Brief description |
---|---|---|
Correlation vs. cosine similarity | Correlation_vs_cosineSimilarity.ipynb | Simulate data to learn the math and implementations of correlation and cosine similarity. |
Zipf's law in famous fiction: characters and GPT4 tokens | ZipfsLaw_charactersTokens.ipynb | Explore character and subword (GPT4 tokens) frequencies in famous fiction books. |
The Fourier transform, explained with for-loops | Fourier_with_forloops.ipynb | Learn how the Fourier transform works, using for-loops in Python. |
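
To give a flavor of the last row in the table, here is a minimal sketch of a discrete Fourier transform written with explicit for-loops. It is not copied from the notebook; the function name `dft_with_loops`, the toy 3 Hz sine wave, and the 32 Hz sampling rate are all assumptions made for illustration. The result is compared against `numpy.fft.fft` as a sanity check.

```python
import numpy as np

def dft_with_loops(signal):
    """Discrete Fourier transform using two explicit for-loops (illustrative sketch)."""
    n = len(signal)
    coefs = np.zeros(n, dtype=complex)
    for k in range(n):        # loop over frequency indices
        for t in range(n):    # loop over time points
            # dot product between the signal and a complex sine wave at frequency k
            coefs[k] += signal[t] * np.exp(-2j * np.pi * k * t / n)
    return coefs

# Toy signal (assumed for illustration): a 3 Hz sine wave, one second sampled at 32 Hz
srate = 32
time = np.arange(srate) / srate
signal = np.sin(2 * np.pi * 3 * time)

coefs = dft_with_loops(signal)

# With one second of data, frequency index k corresponds to k Hz
peak_hz = int(np.argmax(np.abs(coefs[: srate // 2])))

# The loop-based result should agree with NumPy's FFT up to floating-point error
matches_fft = np.allclose(coefs, np.fft.fft(signal))
print(peak_hz, matches_fft)
```

The double loop is slow for long signals, but it makes each Fourier coefficient visible as a dot product between the signal and a complex sine wave.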
Understand how large language models (LLMs) really work by applying machine learning (ML) methods to their internal activations. Each post explores how LLMs process text, isolate patterns, and generate new outputs. You’ll learn how to probe, manipulate, and explain model internals. Every article includes a complete Python notebook so you can reproduce the results, visualize the mechanisms, and extend the experiments further.
Post title | Code file | Brief description |
---|---|---|
Drawing text heatmaps to visualize LLM calculations | textHeatmaps_GPT2.ipynb | Learn how to create text heatmaps, and then use them to visualize GPT2 next-token predictions. |