128 changes: 128 additions & 0 deletions books/graph_ml_for_engineers/notebooks/1_graphs_intro.ipynb
@@ -0,0 +1,128 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Graphs\n",
"\n",
"Graphs are a general mechanism for describing and analyzing entities together with their relations and interactions.\n",
"\n",
"Rather than thinking of the world as a set of isolated data points:\n",
"\n",
"| Entities | Feature 1 | Feature 2 |\n",
"| ----------- | ----------- | ----------- |\n",
"| A | 1000 | 1001 |\n",
"| B | 2000 | 2002 |\n",
"\n",
"we think of these entities together with the relations that connect them to other entities:\n",
"\n",
"| Entities | Feature 1 | Feature 2 |\n",
"| ----------- | ----------- | ----------- |\n",
"| A | 1000 | 1001 |\n",
"| B | 2000 | 2002 |\n",
"\n",
"➕\n",
"\n",
"| Src Entity | Dst Entity | Relation Type | Feature 1 | Feature 2 |\n",
"| ------------- | ------------- | ------------- | ------------- | ------------- |\n",
"| A | B | Is Child Of | 1 | 1 |\n",
"| B | A | Is Parent Of | 2 | 2 |\n",
"\n",
"\n",
"\n",
"## What is the issue with the non-graph ML toolbox?\n",
"\n",
"It is designed for tabular data, grids of data, or sequences of data:\n",
"- Text / audio sequences have a notion of left & right\n",
"- Images have a notion of up / down & left / right\n",
"\n",
"Graphs have arbitrary size and arbitrary topology, and have no spatial locality.\n",
"\n",
"\n",
"In traditional ML, we represent our nodes, links, or the entire graph as vectors, then train a classical ML model on them, e.g. random forest, SVM, NN, etc.\n",
"When a new node / link / graph appears, we obtain its features and use the trained model to make a prediction. Traditional ML uses hand-crafted features, which is what we will talk about below.\n",
"\n",
"The traditional way to do node prediction:\n",
"\n",
"Given a graph $G = (V, E)$ with a set of vertices $V$ and a set of edges $E$, we want to learn a function $f : V \\rightarrow \\mathbb{R}$.\n",
"\n"
]
},
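{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch, the entity table and the relation table from the top of this notebook can be held in plain Python structures. The names `node_features`, `edges` and `adjacency` are illustrative assumptions and do not come from any library.\n",
"\n",
"```python\n",
"from collections import defaultdict\n",
"\n",
"# Entity table: entity -> (feature 1, feature 2), values copied from the table.\n",
"node_features = {'A': (1000, 1001), 'B': (2000, 2002)}\n",
"\n",
"# Relation table: (src, dst, relation type, edge features).\n",
"edges = [\n",
"    ('A', 'B', 'Is Child Of', (1, 1)),\n",
"    ('B', 'A', 'Is Parent Of', (2, 2)),\n",
"]\n",
"\n",
"# Adjacency list: for each source entity, its outgoing (dst, relation) pairs.\n",
"adjacency = defaultdict(list)\n",
"for src, dst, relation, _edge_features in edges:\n",
"    adjacency[src].append((dst, relation))\n",
"\n",
"print(dict(adjacency))\n",
"```\n",
"\n",
"Keeping the two tables separate mirrors the split above: node features describe each entity on its own, while the edge list carries the relations between them."
]
},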
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### What are some features we can extract for nodes?\n",
"\n",
"- Node Degree\n",
"    - The degree $d_v$ of a node $v$ is the number of edges the node has.\n",
"    - CONS: All neighboring nodes are treated equally, so nodes with the same degree are indistinguishable.\n"
]
},
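{
"cell_type": "markdown",
"metadata": {},
"source": [
"The degree feature can be sketched in a few lines of plain Python. The edge list below is a hypothetical undirected graph chosen only for illustration.\n",
"\n",
"```python\n",
"from collections import Counter\n",
"\n",
"# Hypothetical undirected edge list, used only for illustration.\n",
"edges = [('A', 'B'), ('A', 'C'), ('B', 'C'), ('C', 'D'), ('D', 'E'), ('D', 'F')]\n",
"\n",
"# Each undirected edge adds one to the degree of both of its endpoints.\n",
"degree = Counter()\n",
"for u, v in edges:\n",
"    degree[u] += 1\n",
"    degree[v] += 1\n",
"\n",
"print(degree['C'], degree['D'])\n",
"```\n",
"\n",
"Here `C` and `D` both have degree 3, even though `C` sits in a triangle and two of the neighbors of `D` are leaves. This is the limitation listed under CONS: the degree alone cannot tell the two nodes apart."
]
},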
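{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Node Centrality\n",
"    - Tries to capture the importance of a node in the graph; it can be modeled by:\n",
"        - Eigenvector centrality: a node is important if its neighbors are important\n",
"        - Betweenness centrality: a node is important if it lies on many shortest paths between other nodes\n",
"        - Closeness centrality: a node is important if its shortest paths to all other nodes are short"
]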
},
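{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, the three centrality measures are commonly written as below. The notation is introduced here as an assumption for this sketch: $c_v$ is the centrality of node $v$, $N(v)$ is the set of neighbors of $v$, $\\lambda$ is the largest eigenvalue of the adjacency matrix, $\\sigma_{st}$ is the number of shortest paths between $s$ and $t$, $\\sigma_{st}(v)$ is the number of those paths that pass through $v$, and $d(u, v)$ is the shortest-path distance between $u$ and $v$.\n",
"\n",
"Eigenvector centrality:\n",
"\n",
"$$c_v = \\frac{1}{\\lambda} \\sum_{u \\in N(v)} c_u$$\n",
"\n",
"Betweenness centrality:\n",
"\n",
"$$c_v = \\sum_{s \\neq v \\neq t} \\frac{\\sigma_{st}(v)}{\\sigma_{st}}$$\n",
"\n",
"Closeness centrality:\n",
"\n",
"$$c_v = \\frac{1}{\\sum_{u \\neq v} d(u, v)}$$\n",
"\n",
"As a small worked example, take the path graph $A - B - C$. For closeness, $c_B = \\frac{1}{1 + 1} = \\frac{1}{2}$ and $c_A = \\frac{1}{1 + 2} = \\frac{1}{3}$, so the middle node scores higher than the end nodes."
]
},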
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Clustering Coefficient"
]
},
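{
"cell_type": "markdown",
"metadata": {},
"source": [
"The clustering coefficient of a node $v$ is commonly defined as the fraction of pairs of neighbors of $v$ that are themselves connected. The sketch below assumes that definition; the `neighbors` dictionary is a hypothetical undirected graph, and returning 0 for nodes with fewer than two neighbors is a convention chosen for this example.\n",
"\n",
"```python\n",
"from itertools import combinations\n",
"\n",
"# Hypothetical undirected graph stored as node -> set of neighbors.\n",
"neighbors = {\n",
"    'A': {'B', 'C'},\n",
"    'B': {'A', 'C'},\n",
"    'C': {'A', 'B', 'D'},\n",
"    'D': {'C', 'E', 'F'},\n",
"    'E': {'D'},\n",
"    'F': {'D'},\n",
"}\n",
"\n",
"def clustering_coefficient(v):\n",
"    # Fraction of pairs of neighbors of v that are themselves connected.\n",
"    pairs = list(combinations(sorted(neighbors[v]), 2))\n",
"    if not pairs:\n",
"        return 0.0  # assumption: fewer than two neighbors gives 0\n",
"    linked = sum(1 for a, b in pairs if b in neighbors[a])\n",
"    return linked / len(pairs)\n",
"\n",
"print(clustering_coefficient('C'), clustering_coefficient('D'))\n",
"```\n",
"\n",
"In this graph `C` and `D` both have three neighbors, but only `C` has a connected pair of neighbors (`A` and `B`), so its coefficient is 1/3 while that of `D` is 0."
]
},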
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## What are we trying to do?\n",
"\n",
"Given a node $u$, we are trying to learn a neural network $f$ that generates a $d$-dimensional vector representation of $u$, i.e. $f : u \\rightarrow \\mathbb{R}^d$, where similar nodes have vector representations that are close to each other."
]
},
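{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the idea of close vector representations concrete, here is a minimal sketch. It assumes that $f$ is a plain lookup table rather than a neural network, and the 2-dimensional embedding values are made up for illustration.\n",
"\n",
"```python\n",
"import math\n",
"\n",
"# Hypothetical 2-dimensional embeddings; in practice these would be learned.\n",
"embedding = {\n",
"    'A': (0.9, 0.1),\n",
"    'B': (0.8, 0.2),\n",
"    'C': (-0.7, 0.6),\n",
"}\n",
"\n",
"def f(u):\n",
"    # Stand-in for the learned function: a plain lookup table.\n",
"    return embedding[u]\n",
"\n",
"def cosine_similarity(x, y):\n",
"    # Dot product of x and y divided by the product of their lengths.\n",
"    dot = sum(a * b for a, b in zip(x, y))\n",
"    norm_x = math.sqrt(sum(a * a for a in x))\n",
"    norm_y = math.sqrt(sum(b * b for b in y))\n",
"    return dot / (norm_x * norm_y)\n",
"\n",
"sim_ab = cosine_similarity(f('A'), f('B'))\n",
"sim_ac = cosine_similarity(f('A'), f('C'))\n",
"print(sim_ab > sim_ac)\n",
"```\n",
"\n",
"With these values `A` and `B` point in nearly the same direction, so their similarity is higher than that of `A` and `C`. Cosine similarity is only one possible notion of closeness; a dot product or Euclidean distance would serve the same purpose in this sketch."
]
},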
{
"cell_type": "markdown",
"metadata": {},
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.9.12 ('graph-ml-for-engineers-9GQyHo6a-py3.9')",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.9.12"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "c5835645cbd39e77e80fd28b6a8a6b63c0a1f33699bc9c2aaafa2cbac9764660"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}
49 changes: 49 additions & 0 deletions books/graph_ml_for_engineers/notebooks/2_node_embeddings.ipynb
@@ -0,0 +1,49 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## What are we trying to do?\n",
"\n",
"Input Graph --> Feature Engineering --> Apply some learning algorithm --> Derive some prediction\n",
"\n",
"With graph representation learning we want to eliminate the \"Feature Engineering\" step and learn the features automatically.\n",
"\n",
"\n",
"## So what's the idea?\n",
"\n",
"Learn $f: u \\rightarrow \\mathbb{R}^d$; a function $f$ that maps a given node $u$ to a $d$-dimensional vector of real numbers.\n",
"\n",
"\n",
"Given that we have learned the function $f$, what can we do?\n",
"We can use the resulting vectors for downstream tasks such as node classification, link prediction, graph classification, clustering, anomaly detection, etc.\n"
]
},
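{
"cell_type": "markdown",
"metadata": {},
"source": [
"As one hedged illustration of a downstream task, the sketch below does node classification by copying the label of the nearest labeled node in embedding space. The embeddings, the labels and the nearest-neighbor rule are all assumptions made for this example, not a method prescribed by this notebook.\n",
"\n",
"```python\n",
"# Hypothetical embeddings f(u) and a few known labels; illustrative only.\n",
"embedding = {\n",
"    'A': (0.9, 0.1),\n",
"    'B': (0.8, 0.2),\n",
"    'C': (-0.7, 0.6),\n",
"    'D': (-0.6, 0.5),\n",
"}\n",
"labels = {'A': 'red', 'C': 'blue'}\n",
"\n",
"def squared_distance(x, y):\n",
"    # Squared Euclidean distance between two vectors of equal length.\n",
"    return sum((a - b) ** 2 for a, b in zip(x, y))\n",
"\n",
"def classify(u):\n",
"    # Predict the label of the closest labeled node in embedding space.\n",
"    nearest = min(\n",
"        labels,\n",
"        key=lambda v: squared_distance(embedding[u], embedding[v]),\n",
"    )\n",
"    return labels[nearest]\n",
"\n",
"print(classify('B'), classify('D'))\n",
"```\n",
"\n",
"No hand-crafted features appear anywhere in this sketch: once the vectors exist, an off-the-shelf rule on top of them is enough to make a prediction."
]
},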
{
"cell_type": "markdown",
"metadata": {},
"source": [
" "
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.9.12 ('graph-ml-for-engineers-9GQyHo6a-py3.9')",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.9.12"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "c5835645cbd39e77e80fd28b6a8a6b63c0a1f33699bc9c2aaafa2cbac9764660"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}
19 changes: 19 additions & 0 deletions books/graph_ml_for_engineers/pyproject.toml
@@ -0,0 +1,19 @@
[tool.poetry]
name = "graph-ml-for-engineers"
version = "0.1.0"
description = "Learning Graph Machine Learning made easy for Seasoned Engineers"
authors = ["shubhamvij <reachme@shubhamvij.com>"]
readme = "README.md"
packages = [{include = "graph_ml_for_engineers"}]

[tool.poetry.dependencies]
python = "^3.9"

[tool.poetry.group.dev.dependencies]
networkx = "^2.8.6"
matplotlib = "^3.5.3"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"