Automated documentation update.
PiperOrigin-RevId: 388761940
TensorFlow Datasets Team authored and Copybara-Service committed Aug 4, 2021
1 parent 9fbb1c4 commit 987063d
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions docs/catalog/ogbg_molpcba.md
@@ -3,7 +3,7 @@
<meta itemprop="name" content="TensorFlow Datasets" />
</div>
<meta itemprop="name" content="ogbg_molpcba" />
<meta itemprop="description" content="&#x27;ogbg-molpcba&#x27; is a molecular dataset sampled from PubChem BioAssay.&#10;It is a graph prediction dataset from the Open Graph Benchmark (OGB).&#10;&#10;This dataset is experimental, and the API is subject to change in&#10;future releases.&#10;&#10;The below description of the dataset is adapted from the OGB paper:&#10;&#10;### Input Format&#10;All the molecules are pre-processed using RDKit ([1]).&#10;&#10;* Each graph represents a molecule, where nodes are atoms, and edges are&#10; chemical bonds.&#10;* Input node features are 9-dimensional, containing atomic number and chirality,&#10; as well as other additional atom features such as formal charge and&#10; whether the atom is in the ring.&#10;* Input edge features are 3-dimensional, containing bond type,&#10; bond stereochemistry, as well as an additional bond feature indicating&#10; whether the bond is conjugated.&#10;&#10;The exact description of all features is available at&#10;https://github.com/snap-stanford/ogb/blob/master/ogb/utils/features.py.&#10;&#10;### Prediction&#10;The task is to predict 128 different biological activities (inactive/active).&#10;See [2] and [3] for more description about these targets.&#10;Not all targets apply to each molecule: missing targets are indicated by NaNs.&#10;&#10;### References&#10;&#10;[1]: Greg Landrum, et al. &#x27;RDKit: Open-source cheminformatics&#x27;.&#10; URL: https://github.com/rdkit/rdkit&#10;&#10;[2]: Bharath Ramsundar, Steven Kearnes, Patrick Riley, Dale Webster,&#10; David Konerding and Vijay Pande. &#x27;Massively Multitask Networks for&#10; Drug Discovery&#x27;.&#10; URL: https://arxiv.org/pdf/1502.02072.pdf&#10;&#10;[3]: Zhenqin Wu, Bharath Ramsundar, Evan N Feinberg, Joseph Gomes,&#10; Caleb Geniesse, Aneesh S. 
Pappu, Karl Leswing, and Vijay Pande.&#10; MoleculeNet: a benchmark for molecular machine learning.&#10; Chemical Science, 9(2):513-530, 2018.&#10;&#10;To use this dataset:&#10;&#10;```python&#10;import tensorflow_datasets as tfds&#10;&#10;ds = tfds.load(&#x27;ogbg_molpcba&#x27;, split=&#x27;train&#x27;)&#10;for ex in ds.take(4):&#10; print(ex)&#10;```&#10;&#10;See [the guide](https://www.tensorflow.org/datasets/overview) for more&#10;informations on [tensorflow_datasets](https://www.tensorflow.org/datasets).&#10;&#10;" />
<meta itemprop="description" content="&#x27;ogbg-molpcba&#x27; is a molecular dataset sampled from PubChem BioAssay.&#10;It is a graph prediction dataset from the Open Graph Benchmark (OGB).&#10;&#10;This dataset is experimental, and the API is subject to change in&#10;future releases.&#10;&#10;The below description of the dataset is adapted from the OGB paper:&#10;&#10;### Input Format&#10;All the molecules are pre-processed using RDKit ([1]).&#10;&#10;* Each graph represents a molecule, where nodes are atoms, and edges are&#10; chemical bonds.&#10;* Input node features are 9-dimensional, containing atomic number and chirality,&#10; as well as other additional atom features such as formal charge and&#10; whether the atom is in the ring.&#10;* Input edge features are 3-dimensional, containing bond type,&#10; bond stereochemistry, as well as an additional bond feature indicating&#10; whether the bond is conjugated.&#10;&#10;The exact description of all features is available at&#10;https://github.com/snap-stanford/ogb/blob/master/ogb/utils/features.py.&#10;&#10;### Prediction&#10;The task is to predict 128 different biological activities (inactive/active).&#10;See [2] and [3] for more description about these targets.&#10;Not all targets apply to each molecule: missing targets are indicated by NaNs.&#10;&#10;### References&#10;&#10;[1]: Greg Landrum, et al. &#x27;RDKit: Open-source cheminformatics&#x27;.&#10; URL: https://github.com/rdkit/rdkit&#10;&#10;[2]: Bharath Ramsundar, Steven Kearnes, Patrick Riley, Dale Webster,&#10; David Konerding and Vijay Pande. &#x27;Massively Multitask Networks for&#10; Drug Discovery&#x27;.&#10; URL: https://arxiv.org/pdf/1502.02072.pdf&#10;&#10;[3]: Zhenqin Wu, Bharath Ramsundar, Evan N Feinberg, Joseph Gomes,&#10; Caleb Geniesse, Aneesh S. 
Pappu, Karl Leswing, and Vijay Pande.&#10; MoleculeNet: a benchmark for molecular machine learning.&#10; Chemical Science, 9(2):513-530, 2018.&#10;&#10;To use this dataset:&#10;&#10;```python&#10;import tensorflow_datasets as tfds&#10;&#10;ds = tfds.load(&#x27;ogbg_molpcba&#x27;, split=&#x27;train&#x27;)&#10;for ex in ds.take(4):&#10; print(ex)&#10;```&#10;&#10;See [the guide](https://www.tensorflow.org/datasets/overview) for more&#10;informations on [tensorflow_datasets](https://www.tensorflow.org/datasets).&#10;&#10;&lt;img src=&quot;https://storage.googleapis.com/tfds-data/visualization/fig/ogbg_molpcba-0.1.2.png&quot; alt=&quot;Visualization&quot; width=&quot;500px&quot;&gt;&#10;&#10;" />
<meta itemprop="url" content="https://www.tensorflow.org/datasets/catalog/ogbg_molpcba" />
<meta itemprop="sameAs" content="https://ogb.stanford.edu/docs/graphprop" />
<meta itemprop="citation" content="@inproceedings{DBLP:conf/nips/HuFZDRLCL20,&#10; author = {Weihua Hu and&#10; Matthias Fey and&#10; Marinka Zitnik and&#10; Yuxiao Dong and&#10; Hongyu Ren and&#10; Bowen Liu and&#10; Michele Catasta and&#10; Jure Leskovec},&#10; editor = {Hugo Larochelle and&#10; Marc Aurelio Ranzato and&#10; Raia Hadsell and&#10; Maria{-}Florina Balcan and&#10; Hsuan{-}Tien Lin},&#10; title = {Open Graph Benchmark: Datasets for Machine Learning on Graphs},&#10; booktitle = {Advances in Neural Information Processing Systems 33: Annual Conference&#10; on Neural Information Processing Systems 2020, NeurIPS 2020, December&#10; 6-12, 2020, virtual},&#10; year = {2020},&#10; url = {https://proceedings.neurips.cc/paper/2020/hash/fb60d411a5c5b72b2e7d3527cfc84fd0-Abstract.html},&#10; timestamp = {Tue, 19 Jan 2021 15:57:06 +0100},&#10; biburl = {https://dblp.org/rec/conf/nips/HuFZDRLCL20.bib},&#10; bibsource = {dblp computer science bibliography, https://dblp.org}&#10;}" />
@@ -104,7 +104,8 @@ FeaturesDict({

* **Figure**
([tfds.show_examples](https://www.tensorflow.org/datasets/api_docs/python/tfds/visualization/show_examples)):
Not supported.

<img src="https://storage.googleapis.com/tfds-data/visualization/fig/ogbg_molpcba-0.1.2.png" alt="Visualization" width="500px">

* **Examples**
([tfds.as_dataframe](https://www.tensorflow.org/datasets/api_docs/python/tfds/as_dataframe)):