Commit

fix bulleted list
Justin Sybrandt committed May 18, 2020
1 parent 781982a commit a3274e4
Showing 1 changed file with 23 additions and 19 deletions.
42 changes: 23 additions & 19 deletions docs/help/embed_semantic_graph.md
@@ -9,25 +9,29 @@ PTBG is a complex tool that requires a number of preprocessing steps to use.

## PTBG Process Outline

- 1. Create a single directory that contains all semantic graph edges.
- - This is produced by running `agatha.construct`.
- - Edges are stored as small key-value json files.
- - The directory may contain a large number of files.
- 2. Convert graph to PTBG input format.
- - PTBG requires that we index and partition all nodes and edges.
- - Look into `tools/convert_graph_for_pytorch_biggraph` for how to do this.
- 3. Create a PTBG config.
- - Specify all node / edge types.
- - Specify location of all input files.
- - The parameters of this config must match the options used in the
- conversion.
- 4. Launch the PTBG training cluster.
- - Use 10-20 machines, too many will slow this process.
- - Wait at least 5 epochs, will take days.
- 5. Index the resulting embeddings for use in Agatha.
- - Agatha needs to know where to find each embedding, given the node name.
- - Use `tools/py_scripts/ptbg_index_embeddings.py` to create a lookup table
- that maps each node name to its embedding metadata.
+ 1. Create a single directory that contains all semantic graph edges.
+ - This is produced by running `agatha.construct`.
+ - Edges are stored as small key-value json files.
+ - The directory may contain a large number of files.
+
+ 2. Convert graph to PTBG input format.
+ - PTBG requires that we index and partition all nodes and edges.
+ - Look into `tools/convert_graph_for_pytorch_biggraph` for how to do this.
+
+ 3. Create a PTBG config.
+ - Specify all node / edge types.
+ - Specify location of all input files.
+ - The parameters of this config must match the options used in the
+ conversion.
+
+ 4. Launch the PTBG training cluster.
+ - Use 10-20 machines, too many will slow this process.
+ - Wait at least 5 epochs, will take days.
+
+ 5. Index the resulting embeddings for use in Agatha.
+ - Agatha needs to know where to find each embedding, given the node name.
+ - Use `tools/py_scripts/ptbg_index_embeddings.py` to create a lookup table
+ that maps each node name to its embedding metadata.
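
Editor's illustration of step 3 in the outline above: a minimal sketch of what a PTBG config might look like, plus a small check that it agrees with the conversion step. This is not taken from the Agatha repository. The node types (`sentence`, `entity`), all paths, the partition count, the dimension, and the `check_config` helper are hypothetical; the key names follow the PyTorch-BigGraph config schema as commonly documented and should be verified against the PTBG version in use.

```python
# Hypothetical values throughout: node types, paths, and sizes are
# illustrative assumptions, not Agatha's actual settings.
NUM_PARTITIONS = 4  # assumed to equal the partition count used in conversion
ENTITY_TYPES = ["sentence", "entity"]  # hypothetical node types


def get_torchbiggraph_config():
    """Return a PTBG-style config dict describing nodes, edges, and paths."""
    return {
        # Locations of the converted input files and of the output embeddings.
        "entity_path": "/path/to/ptbg/entities",
        "edge_paths": ["/path/to/ptbg/edges"],
        "checkpoint_path": "/path/to/ptbg/embeddings",
        # One entry per node type, each with its partition count.
        "entities": {t: {"num_partitions": NUM_PARTITIONS} for t in ENTITY_TYPES},
        # One entry per edge type (every ordered pair of node types here).
        "relations": [
            {"name": f"{lhs}_to_{rhs}", "lhs": lhs, "rhs": rhs,
             "operator": "translation"}
            for lhs in ENTITY_TYPES
            for rhs in ENTITY_TYPES
        ],
        "dimension": 128,
        "num_epochs": 5,     # the outline suggests at least 5 epochs
        "num_machines": 10,  # the outline suggests 10-20 machines
    }


def check_config(config, converted_partitions):
    """Return a list of mismatches between the config and the conversion."""
    problems = []
    for name, spec in config["entities"].items():
        if spec["num_partitions"] != converted_partitions:
            problems.append(f"{name}: partition count differs from conversion")
    for rel in config["relations"]:
        for side in ("lhs", "rhs"):
            if rel[side] not in config["entities"]:
                problems.append(f"{rel['name']}: unknown node type {rel[side]}")
    return problems
```

Calling `check_config(get_torchbiggraph_config(), converted_partitions=4)` before launching training returns an empty list when the config and conversion agree, which is one inexpensive way to catch the mismatch that step 3 warns about.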

## Convert Edges to PTBG format
