
Coordinating Tutorial Fixes #2112

Closed
rbharath opened this issue Aug 21, 2020 · 70 comments

Comments

@rbharath
Member

The tutorials need a thorough scrubbing. Let's use this issue to coordinate the needed work and sign up for fixing various tutorials.

CC @peastman @neel-shah

@peastman
Contributor

I'll start on the two GAN tutorials. The CGAN one is very out of date. I think it would be best to convert it to use the GAN class.

@peastman
Contributor

I'm starting on tutorial 2.

@peastman
Contributor

I'm looking into tutorial 12 now.

I suggest we just get rid of tutorial 20, the one on converting models to estimators. We don't support make_estimator() anymore, and estimators are mostly outdated at this point anyway.

@peastman
Contributor

I'm getting an error in tutorial 12 that I'm not sure what to do about. When it tries to evaluate the performance on crystal_dataset it gets this error:

ValueError: Only one class present in y_true. ROC AUC score is not defined in that case.

Looking at the original data file desc_canvas_aug30.csv I can confirm this is true. The code tells it to load the Class column as y, and every row in the file contains 1 for that column. I don't see how this could ever have worked.

@rbharath
Member Author

+1 for getting rid of tutorial 20! Estimators seem to be on the way out in TensorFlow, so the tutorial seems less useful now.

Good point on tutorial 12. Perhaps we can switch the metric to something like recall on the positive class, or FPR/TPR, which would be meaningful for a dataset with only positives? (Assuming the datapoints are all positives.)

@peastman
Contributor

Why is the tutorial testing on a dataset with only positives in the first place? That doesn't seem like a very useful test. Can you explain what the crystal dataset is?

@rbharath
Member Author

That tutorial was based on this paper (https://pubs.acs.org/doi/abs/10.1021/acs.jcim.6b00290). It's been a few years, but I recall that the dataset was testing whether a model trained on assay data could generalize to a separate dataset of compounds that had been co-crystallized with the protein (i.e., protein-ligand binding structures). Since these were crystal structures, they were all positives (we only had structures of binders).

If this is getting too far into the weeds, it might be totally fine to just remove it from the tutorial. It's probably too much for beginners and may not be helping build comprehension.

@peastman
Contributor

What if we just remove the checks against that dataset? We already print results for the train, validation, and test datasets. That seems like plenty for a tutorial.

@rbharath
Member Author

Sounds good!

@peastman
Contributor

I'm reaching the conclusion that notebook is just a mess. I'm trying to figure out what it's doing and why, but with only partial success. Here's what I've figured out so far.

It begins by downloading two CSV files and loading them into memory. It confusingly assigns them to variables called dataset and crystal_dataset, even though they're actually Pandas DataFrames, not Datasets. There's no explanation for any of this code, including any description of what's in those files it's loading.

It plots a bunch of molecules from both files, but there's no explanation of why it does this or what we're supposed to learn from them other than the one vague sentence, "Note the complex ring structures and polar structures."

It now uses a UserCSVLoader to create actual Datasets from those files. Just to be extra confusing, it assigns them to the same variables it previously used for the DataFrames, dataset and crystal_dataset. It specifies that the "Class" column should be loaded as the labels, but gives no information about what's contained in that column.

It splits it into train, validation, and test datasets. It comments without explanation that the test set is really a validation set and the validation set is really a test set. I have no idea what it's talking about. It also applies normalization and clipping transformers. The text says these are being applied to the pIC50 values, which is false: as noted above, the Dataset used the "Class" column, not the "pIC50" column.

It fits RandomForestClassifiers and MultitaskClassifiers to the training set (again emphasizing that the labels are classes, not continuous values, so the transformers made no sense) and computes the AUC on the various datasets. As noted above, this produces an exception for the crystal dataset because every sample has the same class. I don't see any way this code could ever have worked.

Now it loads the files yet again, only this time loading the "pIC50" column. (And yet again reusing the same variables!) It applies the transformers, fits regression models, and computes R^2 on them. Except that it gets an R^2 of 0 for the crystal dataset, because that file doesn't contain any pIC50 values. The column is completely blank.

By the way, it refers to the dataset as "BACE". Is that the same BACE dataset that's already included in molnet? If so, why go through the business of downloading CSV files rather than just loading it from molnet?

Anyway, as it currently exists this notebook is not a tutorial. It's just a bunch of code, some of it nonfunctional, with very little explanation of what it's doing or why. Perhaps it could be turned into a tutorial, but that would involve a major rewrite. If you can explain what's going on with the data, and what we want the user to learn from reading the tutorial, I could attempt to do that.

@rbharath
Member Author

I agree, this one is a mess! This is the same BACE that we have in MoleculeNet. I think the original version of this tutorial predates the moleculenet version (hence the archaic loading). Perhaps one way we could cleanly rework it is to just make it a tutorial about working with the BACE dataset from MoleculeNet.

The difference between this and the earlier solubility tutorials would be that we would be learning to predict protein-ligand interactions instead of a chemical property, but structurally the tutorial should probably match our earlier tutorial on property prediction with MoleculeNet.

@neel-shah
Contributor

neel-shah commented Aug 25, 2020 via email

@peastman
Contributor

I'll also do tutorial 5 while I'm at it. It seems to have some of the same problems.

@neel-shah
Contributor

I am trying to fix tutorial 4.

It seems like I need some understanding of the graph conv model and also of the featurization for the model. I am trying to understand the relevant parts of the Keras library for this tutorial and also the model through the code. I think it will be much easier for me to go through some material to understand these things. For example, I am not exactly sure what the degree_slice array contains (I actually got more confused after looking into the code). It also looks like I might need to know how degree_slice is used in the model (at least to fix tutorial 4).

Is there more documentation for it, or some other material that I can read? (I am interested in understanding both the graph conv model and its featurization.) If not, I can try to spend more time going through the TensorFlow and Keras documentation and the DeepChem code.

@peastman
Contributor

On tutorial 5 I'm running into the same error about "Only one class present in y_true." This one uses the MUV dataset. Out of 93,000 samples, no task has more than 30 positives, and a lot have even fewer. And there are 17 tasks. The odds are very high that either the validation set or the test set will randomly end up with no positives at all for at least one task, and then we get an error.
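As a rough back-of-the-envelope check (a sketch; the per-task positive counts below are illustrative assumptions, not the actual MUV numbers):

# Rough estimate of how likely a plain random 80/10/10 split is to leave at least
# one task with zero positives in the validation or test set.
# The positive counts are illustrative; the real MUV counts vary by task.
positive_counts = [30, 27, 24, 21, 18, 15] + [30] * 11  # 17 tasks, <= 30 positives each

frac_valid = frac_test = 0.1
p_all_tasks_ok = 1.0
for k in positive_counts:
    p_no_pos_in_valid = (1 - frac_valid) ** k
    p_no_pos_in_test = (1 - frac_test) ** k
    # Treat the two events as independent for a quick estimate.
    p_all_tasks_ok *= (1 - p_no_pos_in_valid) * (1 - p_no_pos_in_test)

print("P(some task has no positives in valid or test) ~= %.2f" % (1 - p_all_tasks_ok))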

How do you suggest dealing with this?

@rbharath
Member Author

@neel-shah Unfortunately there isn't good documentation for the usage of these intermediate arrays. I've written up some docs for WeaveLayer here that explain similar (but not identical) intermediate arrays (https://deepchem.readthedocs.io/en/latest/layers.html#deepchem.models.layers.WeaveLayer), which might be a good starting point. In general, degree_slice and the like contain bookkeeping about the molecular graph (various ways of encoding which atoms are bonded, etc.). It would be really useful to improve the documentation for GraphConv as we've done for WeaveLayer, explaining what each of these arrays does. Off the top of my head, I recall that degree_slice does something like separate out the atoms that have 0 bonds, 1 bond, 2 bonds, and so on.
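One quick way to get a feel for these arrays is to featurize a couple of molecules and print them. A minimal sketch (the attribute names deg_slice, membership, and get_deg_adjacency_lists are what I recall from the ConvMol code, so worth double-checking):

import deepchem as dc
from deepchem.feat.mol_graphs import ConvMol

# Featurize two small molecules and merge them into one batch, the same way
# GraphConvModel's generator does internally.
featurizer = dc.feat.ConvMolFeaturizer()
mols = featurizer.featurize(["CCO", "c1ccccc1"])
batch = ConvMol.agglomerate_mols(mols)

# Atoms are sorted by degree; deg_slice[d] gives the (start, count) of the block
# of atoms with degree d, and membership records which molecule each atom belongs to.
print(batch.get_atom_features().shape)              # (n_atoms, 75)
print(batch.deg_slice)                              # one (start, count) row per degree
print(batch.membership)                             # molecule index for each atom
print([a.shape for a in batch.get_deg_adjacency_lists()])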

@rbharath
Member Author

@peastman Perhaps RandomStratifiedSplitter might help? https://deepchem.readthedocs.io/en/latest/splitters.html#deepchem.splits.RandomStratifiedSplitter. I think this class tries to guarantee that valid/test have at least some number of positives. That said, it looks like we haven't cleaned up RandomStratifiedSplitter in a while, so it might need a bit of sprucing up to solve our issue here.
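For reference, the intended usage would be roughly the following (a sketch assuming the standard Splitter interface; whether the loader accepts split=None to return the unsplit dataset, and whether this actually avoids the empty-class problem, is exactly what needs checking):

import deepchem as dc

# Load MUV without splitting, then apply the stratified splitter by hand.
tasks, datasets, transformers = dc.molnet.load_muv(split=None)
dataset = datasets[0]

splitter = dc.splits.RandomStratifiedSplitter()
train, valid, test = splitter.train_valid_test_split(
    dataset, frac_train=0.8, frac_valid=0.1, frac_test=0.1)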

@peastman
Contributor

load_muv(split='stratified') throws an exception from down inside the splitter. I'll see if I can figure out why.

@peastman
Contributor

Yet another issue I'm running into on tutorial 12. If I tell it to use a random split, the random forest and MultitaskRegressor both do reasonably well:

RF Train set R^2 0.949594
RF Valid set R^2 0.737146
RF Test set R^2 0.663517

DNN Train set R^2 0.858852
DNN Valid set R^2 0.705035
DNN Test set R^2 0.630337

But the documentation for BACE recommends using scaffold splitting instead. (So why isn't that the default? Seems like a bug to me.) And when I do that, the generalization performance is terrible. In fact, the validation set comes out strongly anticorrelated with the model predictions.

RF Train set R^2 0.951580
RF Valid set R^2 -0.785601
RF Test set R^2 0.469812

DNN Train set R^2 0.777881
DNN Valid set R^2 -1.160022
DNN Test set R^2 0.400530

Any suggestions?
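For context, this is roughly how the two variants can be loaded from MolNet (a sketch; the featurizer choice and the split keyword are assumptions and may not match what the tutorial actually does):

import deepchem as dc

# Random split: both models generalize reasonably well.
tasks, datasets, transformers = dc.molnet.load_bace_regression(
    featurizer="ECFP", split="random")

# Scaffold split: held-out scaffolds are much harder, and this is where the
# negative R^2 on the validation set shows up.
tasks, datasets, transformers = dc.molnet.load_bace_regression(
    featurizer="ECFP", split="scaffold")
train_dataset, valid_dataset, test_dataset = datasets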

@neel-shah
Contributor

I've written up some docs for WeaveLayer here that explain similar but not identical intermediate arrays

Thanks @rbharath, the docs helped indeed. I think I now have a decent enough understanding of graph conv to get through tutorial 4 (or so I hope).

However, I am stuck on an error which makes it look like the code in the tutorial either never worked in the first place or some major changes have happened underneath. At one point tutorial 4 talks about creating a graph conv model from scratch. For that we first build the input layers of the model

import tensorflow as tf
import tensorflow.keras.layers as layers

atom_features = layers.Input(shape=(75,))
degree_slice = layers.Input(shape=(2,), dtype=tf.int32)
membership = layers.Input(shape=tuple(), dtype=tf.int32)

deg_adjs = []
for i in range(0, 10 + 1):
    deg_adj = layers.Input(shape=(i+1,), dtype=tf.int32)
    deg_adjs.append(deg_adj)

and then we start building the model by applying DeepChem's GraphConv layer

from deepchem.models.layers import GraphConv
gc1 = GraphConv(64, activation_fn=tf.nn.relu)([atom_features, degree_slice, membership] + deg_adjs)

which throws the following error

/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/array_ops.py:2023 split
    raise ValueError("Cannot infer num from shape %s" % num_or_size_splits)

ValueError: Cannot infer num from shape Tensor("graph_conv_5/strided_slice:0", shape=(None,), dtype=int32)

The error looks legitimate to me, because we are trying to split the atom_features object, which is actually a Keras Input layer, i.e. it does not contain the actual features from a dataset.

Do I have that right? Maybe we should instantiate deepchem.models.layers.GraphConv and then invoke its call() method with the actual dataset inputs rather than Keras Input layers?

@peastman
Contributor

Just a guess: try changing the definition of membership to

membership = layers.Input(shape=(1,), dtype=tf.int32)

@neel-shah
Contributor

Thanks!

I still see the same error after changing membership to this

membership = layers.Input(shape=(1,), dtype=tf.int32)

The error comes from this line.

@rbharath
Member Author

rbharath commented Aug 31, 2020

@neel-shah The issue here is that this tutorial was originally written for TF 1.x. Unfortunately, for TF 2.2 we had to change GraphConvModel to use the eager subclassing style instead of the Keras functional style for the implementation to work. This meant that the previously working code no longer runs. I think the way to fix this is to no longer use Input objects, but to evaluate the custom graph conv model directly in TensorFlow eager style. Check out the discussion in issue #2013 and see my comment there for a first step towards a fix: #2013 (comment). Hope that helps!
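Roughly, the eager-style version looks like this (a sketch based on what GraphConvModel's batch generator does internally; the ConvMol attribute names are recalled from the code and worth double-checking):

import numpy as np
import tensorflow as tf
import deepchem as dc
from deepchem.feat.mol_graphs import ConvMol
from deepchem.models.layers import GraphConv

# Build a concrete batch of inputs instead of symbolic Input layers.
featurizer = dc.feat.ConvMolFeaturizer()
mols = featurizer.featurize(["CCO", "c1ccccc1", "CC(=O)O"])
batch = ConvMol.agglomerate_mols(mols)

atom_features = batch.get_atom_features().astype(np.float32)
degree_slice = np.array(batch.deg_slice, dtype=np.int32)
membership = np.array(batch.membership, dtype=np.int32)
deg_adjs = [np.array(a, dtype=np.int32) for a in batch.get_deg_adjacency_lists()[1:]]

# With eager execution (the default in TF 2.x) the layer can be called directly on
# real tensors, so the shapes are known and tf.split no longer fails.
gc1 = GraphConv(64, activation_fn=tf.nn.relu)
out = gc1([atom_features, degree_slice, membership] + deg_adjs)
print(out.shape)  # (n_atoms, 64)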

@rbharath
Member Author

rbharath commented Aug 31, 2020

@peastman Hmm, that's really weird. From a purely scientific viewpoint, scaffold split is better since it tests the ability of the model to generalize to new scaffolds, but it's clear the model doesn't generalize well to new scaffolds. It's really strange to see the anti-correlation. Perhaps one way to handle this in the tutorial would be to make it pedagogical. That is, we can show random split works well, then show scaffold split doesn't. And explain that this means the learned model doesn't actually generalize well to new scaffolds, so users should avoid using it to make predictions on structurally dissimilar molecules.

@peastman
Contributor

peastman commented Sep 1, 2020

We discussed this a bit on Friday. I'm posting here to continue the discussion.

In addition to fixing broken code, a lot of the tutorials need deeper changes. Many of them weren't really designed to be tutorials. They were just notebooks containing sample code. We collected them together and arranged them in a sequence, but they don't really form a coherent set of lessons. I suggest we overhaul the full set of tutorials from beginning to end. Each one should have a specific lesson it teaches, and they should build on each other in a logical way so a user can learn to use DeepChem by reading through the tutorials in order.

We also discussed another tutorial we should add: one that introduces the KerasModel and TorchModel classes, showing users how to use TensorFlow and PyTorch based models with DeepChem. I can write that, if you want.
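As a flavor of what such a tutorial might cover, here's a minimal sketch of wrapping a plain Keras network in KerasModel so it trains directly on a DeepChem Dataset (the dataset and network choices are just placeholders):

import deepchem as dc
import tensorflow as tf

# A plain Keras network wrapped so it can train directly on DeepChem Datasets.
keras_model = tf.keras.Sequential([
    tf.keras.layers.Dense(100, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1)
])
model = dc.models.KerasModel(keras_model, loss=dc.models.losses.L2Loss())

tasks, datasets, transformers = dc.molnet.load_delaney(featurizer='ECFP')
train_dataset, valid_dataset, test_dataset = datasets
model.fit(train_dataset, nb_epoch=10)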

@rbharath
Member Author

rbharath commented Sep 1, 2020

@peastman A tutorial introducing the KerasModel and TorchModel classes would be a great addition! That would be very welcome :)

I definitely agree that we need to overhaul the tutorials to form a coherent set of lessons as you suggest. The tutorial set has grown organically, which means we have a lot of cruft in there that doesn't really contribute. Do you have any thoughts on a potential coherent lesson structure? I can also take a look to see what might make sense.

@peastman
Contributor

peastman commented Sep 2, 2020

I suggest we make up a list of the tutorials we want to have, and what lessons the reader should learn from each one. For example,

  1. Introduce Dataset. Describe the methods of creating them and what you can do with them. (The current tutorial 1 covers a lot of this, but it also spends a lot of time on more advanced subjects like splitters. Those should be delayed to a later tutorial.)
  2. Introduce MoleculeNet. Describe the range of datasets it contains and show how to load them.
  3. Give a first simple example of creating a model (of some standard type) and training it.
  4. Introduce molecular fingerprints and train a model that uses them.
  5. Introduce graph models and train a model that uses them.
  6. Describe how to create your own models with KerasModel and TorchModel.
  7. Describe the featurizers in DeepChem.
  8. Describe the splitters in DeepChem.

And so on. A lot of our current tutorials cover more advanced and specialized subjects, which is fine. We just need to make sure they come later in the sequence, after the reader has already learned everything they need to understand them.

@neel-shah
Contributor

I finally got the code in tutorial 4 to work. This is what the model looks like:

# GraphConv, GraphPool, and GraphGather come from deepchem.models.layers;
# tf, layers, n_tasks, and batch_size are assumed to be defined in earlier cells.
from deepchem.models.layers import GraphConv, GraphPool, GraphGather

class MyKerasModel(tf.keras.Model):

  def __init__(self):
    super(MyKerasModel, self).__init__()
    self.gc1 = GraphConv(64, activation_fn=tf.nn.relu)
    self.batch_norm1 = layers.BatchNormalization()
    self.gp1 = GraphPool()

    self.gc2 = GraphConv(64, activation_fn=tf.nn.relu)
    self.batch_norm2 = layers.BatchNormalization()
    self.gp2 = GraphPool()

    self.dense1 = layers.Dense(128, activation=tf.nn.relu)
    self.batch_norm3 = layers.BatchNormalization()
    self.readout = GraphGather(batch_size=batch_size, activation_fn=tf.nn.tanh)

    self.dense2 = layers.Dense(n_tasks*2)
    self.logits = layers.Reshape((n_tasks, 2))
    self.softmax = layers.Softmax()

  def call(self, inputs):
    a = self.gc1(inputs)
    b = self.batch_norm1(a)
    c = self.gp1([b] + inputs[1:])

    d = self.gc2([c] + inputs[1:])
    e = self.batch_norm2(d)
    f = self.gp2([e] + inputs[1:])

    g = self.dense1(f)
    h = self.batch_norm3(g)
    i = self.readout([h] + inputs[1:])

    j = self.logits(self.dense2(i))
    return self.softmax(j)

And this is the result:

Evaluating model
Training ROC-AUC Score: 0.537518
Valid ROC-AUC Score: 0.517265

To be honest, I don't understand a lot of what is getting executed under the hood. I will spend some time understanding those things, which will probably also help me improve the code description.

@proteneer
Contributor

The Butina clustering code itself has no way of specifying sizes, so you sort of just pick a similarity value you're happy with and see what comes back out. The code really wasn't meant for production.

@peastman
Contributor

peastman commented Oct 1, 2020

Would it be reasonable to make it work the same way scaffold splitter does? The two methods are conceptually pretty similar. Once you identify scaffolds or clusters, the code for assembling them into datasets could be mostly identical.
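Something along these lines, perhaps (a sketch of the idea using RDKit's Butina clustering and Morgan fingerprints, not DeepChem's actual splitter code):

from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem
from rdkit.ML.Cluster import Butina

def butina_split(smiles, cutoff=0.6, frac_train=0.8, frac_valid=0.1):
    """Cluster molecules with Butina, then assign whole clusters to train/valid/test,
    largest clusters first, the same way the scaffold splitter handles scaffold sets."""
    mols = [Chem.MolFromSmiles(s) for s in smiles]
    fps = [AllChem.GetMorganFingerprintAsBitVect(m, 2, 2048) for m in mols]

    # Butina expects a condensed distance matrix (1 - Tanimoto similarity).
    dists = []
    for i in range(1, len(fps)):
        sims = DataStructs.BulkTanimotoSimilarity(fps[i], fps[:i])
        dists.extend([1.0 - s for s in sims])
    clusters = Butina.ClusterData(dists, len(fps), cutoff, isDistData=True)

    train_cutoff = frac_train * len(smiles)
    valid_cutoff = (frac_train + frac_valid) * len(smiles)
    train, valid, test = [], [], []
    for cluster in sorted(clusters, key=len, reverse=True):
        if len(train) + len(cluster) <= train_cutoff:
            train.extend(cluster)
        elif len(train) + len(valid) + len(cluster) <= valid_cutoff:
            valid.extend(cluster)
        else:
            test.extend(cluster)
    return train, valid, test  # lists of dataset indices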

@proteneer
Contributor

That seems reasonable to me.

@peastman
Contributor

peastman commented Oct 7, 2020

Tutorials 13 and 14 both need a lot of changes. 13 is written to download the datafiles and process them by hand, instead of just calling load_pdbbind(). We should probably change it to use the molnet function. And most of tutorial 14 hasn't been written yet. It just ends with a todo. @rbharath you're probably the best person to do these?

@rbharath
Member Author

rbharath commented Oct 8, 2020

Yep, absolutely, will take these on! I'm planning to hit tutorial 3 this week and will hit these two in the next couple of weeks.

@peastman
Contributor

peastman commented Oct 8, 2020

Here's an updated version of the sequence and the status of each one.

  1. Basic Tools of the Deep Life Sciences - DONE
  2. Working with Datasets - DONE
  3. MoleculeNet - DONE
  4. Molecular Fingerprints - DONE
  5. Creating Models with TensorFlow and PyTorch - DONE
  6. Graph Convolutions - DONE
  7. Featurizers - DONE
  8. Splitters - DONE
  9. Advanced Model Training - DONE
  10. Creating Datasets - DONE
  11. Multitask Modelling - DONE
  12. Interaction Fingerprints (from current tutorial 13) - TODO Bharath
  13. Atomic Convolutions (from current tutorial 14) - TODO Bharath
  14. Conditional GAN - DONE
  15. MNIST GAN - DONE
  16. Learning Unsupervised Embeddings for Molecules - DONE
  17. Pretraining and Transfer Learning - TODO
  18. Language Modelling for Transfer Learning (from current tutorial 22) - TODO
  19. One Shot Learning - TODO
  20. Sequence Learning - TODO
  21. Quantum Chemistry (from current tutorial 10) - DONE
  22. Large Scale Chemical Screens (from current tutorial 19) - TODO
  23. Synthetic Feasibility Scoring (from current tutorial 15) - DONE
  24. Interpreting Deep Models (from current tutorial 8) - DONE
  25. Uncertainty in Deep Learning (from current tutorial 7) - DONE
  26. Normalizing Flows (from current unnumbered tutorial) - TODO
  27. Reinforcement Learning (from current tutorial 18) - DONE
  28. Introduction to Bioinformatics (from current tutorial 21) - TODO

There are a few current tutorials that don't appear anywhere in the sequence as currently outlined: learning embeddings with seq2seq (11), synthetic feasibility scoring (15), introduction to bioinformatics (21). What do we want to do with them?

Of the tutorials marked TODO, some need only minor revisions and moving into the correct position in the sequence. Others need bigger changes, and a few are completely new tutorials that need to be written.

[EDITED: Added in the other three tutorials. Updating status of tutorials as they get done.]

@peastman
Contributor

There are a few current tutorials that don't appear anywhere in the sequence as currently outlined: learning embeddings with seq2seq (11), synthetic feasibility scoring (15), introduction to bioinformatics (21). What do we want to do with them?

Anyone have thoughts on this?

@rbharath
Member Author

Hmm, good question. Here are a couple thoughts:

  1. Perhaps we can keep the seq2seq as an advanced topic? It can come before the transfer learning tutorial 16 as another unsupervised learning method, especially if we cover variational autoencoders.
  2. I'm honestly not sure how good the synthetic feasibility model is in practice. Perhaps we can make synthetic feasibility a "real world" tutorial that serves as a companion to tutorial 21 on large scale chemical screens?
  3. I think the introduction to bioinformatics tutorial could fit in at the end as an advanced topic? I'm hoping we can continue building out DeepChem support for genomics (like the models we have in the book). I was intending this as an introductory tutorial for subsequent genomic models but haven't had time to build that out yet.

@peastman
Contributor

Thanks! I edited the list above to add them at the positions you suggested.

@ncfrey
Contributor

ncfrey commented Nov 4, 2020

Since I'm working on docking, I'd be happy to take on Tutorial 12: Interaction Fingerprints (from current tutorial 13). This could be an expanded, end-to-end example of taking protein-ligand complexes, featurizing them with the interaction fingerprints from #2212, using dc.dock to generate docking scores, and then training a model to predict scores.

@rbharath
Member Author

rbharath commented Nov 5, 2020

Awesome thanks @ncfrey!

@peastman
Contributor

peastman commented Nov 5, 2020

I'm trying to update the tutorial on model interpretability (currently number 8). Things are fine up until cell In [14], which throws an exception.

# We can visualize which fingerprints our model declared toxic for the
# active molecule we are investigating
Chem.MolFromSmiles(list(my_fp[242])[0])

That magic number 242 seems to come out of nowhere. It appears in this and the following cells, none of which has any explanation. Many of them don't even produce output. The tutorial then declares we've learned something useful, but if so, I don't know what it is! Any explanation of what's going on?

@rbharath
Member Author

Ah this completely slipped through my inbox!

My understanding of this tutorial is that it really is a magic number. my_fp is a dictionary mapping datapoint ids to the fragments for that molecule (so in this case, the fragments for the 242nd molecule in the source dataset). I think when I (or someone else) was originally writing this code, the intent was just to pull out some of the fragments to display them as pretty pictures, and I just pulled out a few different molecules until one displayed.

Honestly, I'm not sure we've actually learned anything here. It's useful to look at the contributing fragments and I think it does help chemists who have a direct intuition from the molecules, but it is more pretty pictures than anything. I'm super open to ways we could change up this tutorial so it is more useful and actually teaches something good.

@peastman
Contributor

After more work on the interpretability tutorial, I'm starting to have doubts about whether the approach it uses is useful at all. I get a lot of outputs like this:

[screenshot: the model's prediction for one molecule, broken down into per-fragment contributions]

That shows that this particular molecule is strongly predicted to be toxic. But when it breaks the prediction down into signals from particular fragments, all the top ones indicate it should be non-toxic! How can that be? Well, when I scroll further down the list I do start to see lots of fragments that lead to the toxic prediction. But the prediction doesn't come from a small number of fragments that give clear signals. It's a sum of small contributions from hundreds of fragments, with slightly more of those fragments indicating toxic than non-toxic. Furthermore, very few of the contributions are from fragments that are actually present. Most of the prediction comes from fragments that aren't present (value of 0.00 along the left side). Perhaps someone more experienced with this tool would know how to get useful insight from it, but I don't.

@rbharath
Member Author

CC @lilleswing who has more experience with interpretability and might be able to help out :)

I have to confess that I've not had great luck with this interpretability technique either, for reasons similar to those you've mentioned. I think sometimes there's a clear signal from one fragment that provides the deciding "vote", but in a lot of cases it seems to be the long tail of many contributions.

As a quick question, are the non-present fragments coming from the ECFP hash? That is, is it a sort of hash collision leading the tool to think a fragment with the same hash, but not actually present in the molecule, is contributing signal?

Perhaps one way we can address this in the tutorial is to just be up-front about these issues and say that "user beware"?

@peastman
Contributor

I'm not entirely sure, but I suspect it's more a case of, "This fragment is associated with molecules being toxic. Since it isn't present, that's a strong signal that this molecule is non-toxic." LIME works by building a local linear model, so it can't deal very well with non-linear logic like, "The molecule is toxic if any one of these fragments is present."
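For anyone following along, this is roughly the shape of that local-linear-model setup (a sketch with stand-in data and a dummy classifier; in the tutorial the fingerprint matrix and prediction function come from the real features and the trained model):

import numpy as np
from lime.lime_tabular import LimeTabularExplainer

# Stand-in data: a random binary "fingerprint" matrix and a dummy probability
# function, just to make the shape of the API clear.
rng = np.random.RandomState(0)
n_bits = 1024
X_train = rng.randint(0, 2, size=(500, n_bits)).astype(float)

def predict_proba(x):
    # Dummy classifier: probability of "toxic" grows with the number of set bits.
    p = x.mean(axis=1, keepdims=True)
    return np.hstack([1 - p, p])

explainer = LimeTabularExplainer(
    X_train,
    feature_names=["fragment_%d" % i for i in range(n_bits)],
    categorical_features=list(range(n_bits)),
    class_names=["non-toxic", "toxic"])

# LIME fits a local linear model around one sample, so every fragment gets a signed
# weight; fragments with value 0 (absent) can still carry weight, which is exactly
# the behavior discussed above.
exp = explainer.explain_instance(X_train[0], predict_proba, num_features=10)
print(exp.as_list())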

@peastman
Contributor

peastman commented Dec 8, 2020

I was looking into the bioinformatics tutorial. I'm a bit confused, because it has nothing to do with DeepChem. It's a tutorial on how to do basic sequence manipulation with BioPython. It does end with a cell that says, "TODO(rbharath): Continue filling this up in future PRs." Was this meant to be just the start of a tutorial, and it never got finished? If so, what else was it meant to include?

@rbharath
Member Author

rbharath commented Dec 9, 2020

@peastman Yes, you're right! Unfortunately this was meant to be the start of a series and it never got finished. The goal was to work up to the material we present in the book chapters on bioinformatics (or other ML-in-bioinformatics applications like transcription factor binding site prediction).

@arunppsg
Contributor

arunppsg commented Jul 28, 2021

Can I take on fixes to the tutorials' README.md? I would like to fix the links in the README.md file (some links are outdated or broken).

One more thing I would like to suggest is organizing the tutorial links in README.md around DeepChem's functionality. We have close to 30 tutorials. Maybe we could organize them into categories like introductory DeepChem tutorials, tutorials for molecular machine learning, quantum chemistry, etc. The end result in README.md would look something like this:

/* Introductory content */
Introduction  /* This part introduces core functionalities of DeepChem like splitters etc. */

  • Basic Tools of Deep Life Sciences
  • ...

Molecular Machine Learning  /* Tutorials which are specific to molecular machine learning */

  • Predicting Ki of Ligands to Proteins
  • ...

Bioinformatics

And so on.

@rbharath
Member Author

@arunppsg These are all great suggestions! Both the fixing of links and organization of tutorials would add a lot of value. Please go for it :)

@peastman
Contributor

I'd be careful about reordering the tutorials. Especially in the earlier ones, each one assumes you've read the ones that come before.

@ncfrey
Contributor

ncfrey commented Jul 30, 2021

One more thing which I would like to suggest is organizing of tutorial links around functionalities of DeepChem in README.md

@arunppsg @rbharath
Just saw this comment, I made a similar one here: #2626 (review)

+1 to having a "core functionality" tutorial series that preserves the sequential ordering (maybe 1-11?) and introduces all the core features of DeepChem, then splitting into applications as @arunppsg suggested.

@rbharath
Member Author

Tentatively, I think we can close this issue for now, since we've stabilized the core tutorial series. Will re-open if necessary.
