[Model] Refine GraphSAINT #3328

Check the basic pipeline of the code. Next step: check the details of the samplers, the GCN layer (forward propagation), and the loss (backward propagation).

There are still some bugs with sampling in the training procedure.

Validated the implementation on the ppi_node experiments; other setups have not been tested yet.

1. Online sampling on the ppi_node experiments performs perfectly.
2. Sampling is a bit slow because of the operations on [dgl.subgraphs]; the next step is to improve this part by parallelizing the conversion.
3. Figure out why the offline+online sampling method performs badly, which does not make sense.
4. Run experiments on other setups.

Use torch.dataloader to speed up SAINT sampling, with experiments. Apart from Amazon, which is too large, we have run experiments on the other four datasets: ppi, flickr, reddit and yelp. Preliminary results show acceptable running time and metrics. The next step is to use a more accurate profiler, line_profiler, to measure the time consumed, and to adjust num_workers to speed up sampling on certain datasets.

Reorganize some code and comments.

Fix the bugs that kept fully offline sampling and the author's version from working.

Reorganize files and code, then run experiments to compare the performance of offline sampling and online sampling.

1. handle the directory named 'graphsaintdata'
2. control moving the graph between GPU and CPU for the large dataset ('amazon')
3. remove the parameter 'train'
4. refine the annotations of the sampler
5. update README.md, including dataset info, dependency info, etc.

* explain config differences in TEST part
* remove a sampling time variant
* make 'online' an argument
* change 'norm' to 'sampler'
* explain parameters in README.md

* make online an argument
* refine README.md
* refine the code of `collate_fn` in sampler.py: in the training phase it only returns one subgraph, so there is no need to check whether the number of subgraphs is larger than 1
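A minimal sketch of the simplified `collate_fn` described above, assuming the loader hands it a one-element list per training step (batch size 1). The subgraph stand-ins are hypothetical plain lists of node IDs, and the actual code in sampler.py may differ.

```python
def collate_fn(batch):
    """Unwrap the single pre-sampled subgraph delivered per training step.

    Assumption: the loader uses a batch size of 1, so `batch` is a list
    holding exactly one subgraph and no length check is needed.
    """
    return batch[0]


# Hypothetical stand-ins for subgraphs: each one is just a list of node IDs.
subgraphs = [[0, 1, 2], [2, 3], [4, 5, 6, 7]]

# Mimic what a data loader with batch size 1 would pass to collate_fn.
steps = [collate_fn([sg]) for sg in subgraphs]
```

With a PyTorch `DataLoader`, the same function would be passed through the `collate_fn` argument, which receives the list of samples for each batch.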

Verify that the problem on flickr is caused by overfitting.

Fix the overfitting problem on the `flickr` dataset. We need to restrict the number of subgraphs (and therefore the number of iterations) used in each epoch of the training phase; otherwise the model might overfit by the time it is validated at the end of each epoch. The limit is computed with a formula specified by the author.

* Set up a new flag `full` specifying whether the number of subgraphs used in the training phase equals the number of pre-sampled subgraphs
* Modify code and annotations related to the new flag
* Add a new parameter called `node_budget` to the base class `SAINTSampler` to compute the formula
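The commit message does not spell out the formula, so the following is only a sketch under an explicit assumption: that one epoch should cover roughly as many nodes as the training graph, giving `ceil(num_train_nodes / node_budget)` iterations. The function name `num_train_iterations` and all numbers are hypothetical.

```python
import math


def num_train_iterations(num_train_nodes, node_budget, num_presampled, full):
    """Return how many subgraphs to use in one training epoch.

    Assumption (not confirmed by the commit message): when `full` is False,
    the epoch length is ceil(num_train_nodes / node_budget), so that one
    epoch touches about as many nodes as the training graph contains.
    """
    if full:
        # Use every pre-sampled subgraph in each epoch.
        return num_presampled
    return math.ceil(num_train_nodes / node_budget)


# Hypothetical sizes: 10,000 training nodes, 3,000 nodes per subgraph,
# 50 pre-sampled subgraphs.
limited = num_train_iterations(10_000, 3_000, 50, full=False)
everything = num_train_iterations(10_000, 3_000, 50, full=True)
```

Under these assumed numbers, the limited setting runs far fewer iterations per epoch than using all pre-sampled subgraphs, which is the effect the `full` flag is meant to control.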

* Finish the experiments on Flickr, run after adding the new flag `full`

* use half of the edges in the original graph for sampling
* test dgl.random.choice with and without replacement on half of the edges

~ next: test whether moving the probability calculation out of `__getitem__` speeds up sampling, and try to implement the author's sampling method
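The half-edges idea can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the sampler from this pull request: it assumes the graph stores each undirected edge in both directions, keeps only the `src < dst` copy, and samples uniformly with the standard `random` module instead of `dgl.random.choice` or degree-based edge probabilities. The names `half_edges` and `sample_edge_subgraph` are hypothetical.

```python
import random


def half_edges(edges):
    """Keep one direction (src < dst) of each undirected edge."""
    return [(u, v) for (u, v) in edges if u < v]


def sample_edge_subgraph(edges, edge_budget, replace, rng):
    """Sample edges uniformly and return (picked_edges, induced_node_set)."""
    candidates = half_edges(edges)
    if replace:
        # With replacement: the same edge may be drawn more than once.
        picked = rng.choices(candidates, k=edge_budget)
    else:
        # Without replacement: cap the budget at the number of candidates.
        picked = rng.sample(candidates, k=min(edge_budget, len(candidates)))
    nodes = {n for edge in picked for n in edge}
    return picked, nodes


# Toy undirected graph stored with both directions of every edge.
edges = [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 2), (0, 3), (3, 0)]
rng = random.Random(0)
picked, nodes = sample_edge_subgraph(edges, 3, replace=False, rng=rng)
```

Sampling from the deduplicated list halves the candidate set, which is one reason the half-edges mechanism can reduce sampling cost.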

* employ cython to implement per-edge sampling
* run experiments to test consumed time and performance
  * the consumed time decreased to approximately 480s, while the performance decreased by about 5 points
* deprecate cython implementation

* This reverts commit 4ba4f09
* Deprecate cython implementation
* Retain the half-edges mechanism

* delete unnecessary annotations

Commits on Jul 17, 2021

The start of experiments of Jiahang Li on GraphSAINT.

LspongebobJH committed Jul 17, 2021

1ab071c

Commits on Jul 22, 2021

a night build

LspongebobJH committed Jul 22, 2021

3fb891b

Commits on Aug 2, 2021

Update .gitignore

LspongebobJH committed Aug 2, 2021

d99deac

Commits on Aug 11, 2021

reorganize codes

Reorganize some code and comments.

LspongebobJH committed Aug 11, 2021

64f749f

Commits on Aug 14, 2021

fix bugs

Fix the bugs that kept fully offline sampling and the author's version from working.

LspongebobJH committed Aug 14, 2021

b64005e

Commits on Sep 8, 2021

Merge branch 'master' into jiahanli

LspongebobJH committed Sep 8, 2021

c206af4

Commits on Oct 7, 2021

Merge branch 'master' into jiahanli

mufeili committed Oct 7, 2021

e0c0a26

Commit dates in this pull request (2021): Jul 17, 19, 22, 27, 30; Aug 2, 11, 13, 14, 17, 18; Sep 6, 7, 8, 14, 15, 17, 23, 26, 29; Oct 7.