New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrite the top-level README to be more focused #1081
Conversation
Code Climate has analyzed commit 905b17d and detected 0 issues on this pull request. View more on Code Climate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I like the new way it's organised, with a better focus on getting started and getting help, and removing guiding principles from the top.
Some of my suggestions are based around trying to make the minimal example appear more minimal
Co-Authored-By: kevin <33508488+kjun9@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @kjun9, I much prefer the new more minimal one.
I've updated this with a few changes:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great 👍
README.md
Outdated
@@ -38,90 +38,132 @@ | |||
</p> | |||
|
|||
|
|||
# Table of Contents | |||
**StellarGraph** is a Python library for machine learning on [graphs and networks](https://en.wikipedia.org/wiki/Graph_\(discrete_mathematics\)). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great concise summary, however the line of text gets a bit lost between the badges and contents.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try moving the badges above the # StellarGraph Machine Learning Library
title, and see if that works better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I forgot about this. @timpitman what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it looks good, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great refactor of the readme, well done! I've found a few minor issues. There's future work in expanding "getting started" that we're already looking at for #992.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
excellent!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good!
README.md
Outdated
|
||
Graph-structured data represent entities as nodes (or vertices) and relationships between them as edges (or links), along with associated data as attributes. For example, a graph can contain people as nodes and friendships between them as links, with data like a person's age and the date a friendship was established. StellarGraph supports analysis of many kinds of graphs: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
data as attributes
associated with nodes and/or edges. (Didn't sound complete to me without whose attributes we are talking about.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know how to phrase this without being a bit awkward. I went with
and can data include data associated with either as attributes
but I don't particularly like it. What do you think? Do you have a suggestion?
|
||
## Guiding Principles | ||
- homogeneous (with nodes and links of one type), | ||
- heterogeneous (with more than one type of nodes and/or links) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe a comma after all the bullets and an ', and' on the second last one and a full stop after the last.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm in two minds about this style. It does make it read as a sentence, but the hanging and
and inconsistent punctuation can be disconcerting.
For instance, https://developers.google.com/tech-writing/one/lists-and-tables doesn't use this style, and the discussion of "keep items parallel" somewhat implies against it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a confident opinion about the style but having only bullet point with a comma at the end bothered me :-).
I usually do commas and the comma and when having bullets that are not starting as standalone sentences. I don't know when I started doing that but held on to that convention. I am down with any popular style guide :-).
supervised classifier training for the downstream task. | ||
- See the demo in folder `demos/node-classification-hinsage` for examples of how to predict attributes of nodes | ||
using the HinSAGE algorithm for given node features and training labels. | ||
## Getting Help |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Getting Help section would look more appropriate at the end before the references?
Usually Help tab or link is towards the end. Also, the way I see it is that the Getting Help is for everything, code, installation, and all else. So once, all the things are already mentioned here in the readme, if still a user needs here, here are all the ways they can get it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I replied to this on your comment on the table of contents)
|
||
**StellarGraph** is a Python library for machine learning on [graphs and networks](https://en.wikipedia.org/wiki/Graph_\(discrete_mathematics\)). | ||
|
||
## Table of Contents |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@huonw I have a bit of a different take on the structure and this is only meant as a suggestion, so take whatever makes sense in the end.
- Introduction the overarching explanation of what Graph ML is and what StellarGraph does
- Algorithms short teaser of all the methods implemented in the library
- Installation the structure right now tells a user how to install with demo before installing the library. I think the basic library installation should be upfront. The demo are the optional bit that if a user is interested in testing the functionality of the library before they can install with the demos.
- Sample demos here you point out to the demos directory.
4.1 Example The detailed example should follow right after the general introduction to the sample demos. (the example looks much more user friendly to me now :-)). I have some other comments that I will add to the example. - Getting Help _I think the help is for all the things above, installation, understanding of approaches, demos etc. so should come at the end of everything. _
- Citing
- References
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Algorithms short teaser of all the methods implemented in the library
I put this towards the end because I felt it's mainly helpful for someone familiar with graphs and graph ML. For instance, someone who is looking for a specific algorithm. They might either come here through a google search for that algorithm name or searching for graph ML libraries, and then either ctrl-F for the specific algorithm or jump to the section.
For someone unfamiliar with graph ML (or, at least, unfamiliar with the specific algorithms), I don't think the current table of algorithms is very easily digested, and so I'm concerned readers will get distracted before wading through it all.
Installation the structure right now tells a user how to install with demo before installing the library. I think the basic library installation should be upfront. The demo are the optional bit that if a user is interested in testing the functionality of the library before they can install with the demos.
I put this later because it's a bit of a "reference" section. If a user has decided to install StellarGraph, they can easily jump to the relevant section; if they haven't decided to install it, it's mostly distraction. For the latter, one thing it is useful for might be to emphasise how easy it is to install, so I've added:
It is thus also [easy to install with `pip` or Anaconda](#Installation).
to the end of the last paragraph of the introduction.
Sample demos here you point out to the demos directory.
4.1 Example The detailed example should follow right after the general introduction to the sample demos. (the example looks much more user friendly to me now :-)). I have some other comments that I will add to the example.
To be specific, you think it's better to have a "Sample demos" section than a "Getting started" one? I feel that the "getting started" phrasing is more obvious as a place for users to look: it is talking about what they want to do (start using StellarGraph), rather than the route we're suggesting they use (look at demos).
Getting Help _I think the help is for all the things above, installation, understanding of approaches, demos etc. so should come at the end of everything. _
As touched upon my some of my earlier comments, I don't see this as a linear document because there's some sections that many uses will not look at (because they don't need to, for their particular use-case/background, e.g. if pip install stellargraph
is fine for a user, they don't need to look at the Anaconda or docker installation instructions). In addition, I think we should be emphasising the getting help section: we are not getting too many issues at the moment, so we should be happy to help anyone with any problem related to StellarGraph. It's better for someone to ask a question that has already been answered (but they didn't find) than to not ask and give up using StellarGraph.
|
||
# convert the raw data into StellarGraph's graph format for faster operations | ||
graph = sg.StellarGraph(nodes, edges) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this part should be split. Creating StellarGraph is an independent task than the Generator
and other model specific tasks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in theory it is, but I put this here for 3 reasons:
- three has a reputation as being a good number for writing https://en.wikipedia.org/wiki/Rule_of_three_(writing)
- put all of the StellarGraph stuff together so that we can say that the first and third sections are just normal data science, to emphasise that graph ML is not a new workflow
- I couldn't think of a third reason, but 3 is nice 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- three has a reputation as being a good number for writing https://en.wikipedia.org/wiki/Rule_of_three_(writing)
I didn't know about the rule of three. Learnt something new :-D.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the thoughtful review @habiba-h, I've made some updates based on some, and replied with some questions for the others.
README.md
Outdated
@@ -38,90 +38,132 @@ | |||
</p> | |||
|
|||
|
|||
# Table of Contents | |||
**StellarGraph** is a Python library for machine learning on [graphs and networks](https://en.wikipedia.org/wiki/Graph_\(discrete_mathematics\)). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I forgot about this. @timpitman what do you think?
README.md
Outdated
|
||
Graph-structured data represent entities as nodes (or vertices) and relationships between them as edges (or links), along with associated data as attributes. For example, a graph can contain people as nodes and friendships between them as links, with data like a person's age and the date a friendship was established. StellarGraph supports analysis of many kinds of graphs: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know how to phrase this without being a bit awkward. I went with
and can data include data associated with either as attributes
but I don't particularly like it. What do you think? Do you have a suggestion?
|
||
## Guiding Principles | ||
- homogeneous (with nodes and links of one type), | ||
- heterogeneous (with more than one type of nodes and/or links) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm in two minds about this style. It does make it read as a sentence, but the hanging and
and inconsistent punctuation can be disconcerting.
For instance, https://developers.google.com/tech-writing/one/lists-and-tables doesn't use this style, and the discussion of "keep items parallel" somewhat implies against it.
from sklearn import model_selection | ||
|
||
# load data into Pandas DataFrames, e.g. from CSV files or a database | ||
nodes, edges, targets = load_my_data() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, there's no load_my_data
function in StellarGraph. We previously spelled this out in more detail, e.g. https://github.com/stellargraph/stellargraph/blob/1aa1188fbc0ea5ae7f65f1861dc5e8f31f44f3e9/README.md#example-gcn shows the pre-review version of the README, but we cut it out because it allowed reducing the amount of code significantly (see #1081 (comment)).
As before, this example is focusing on giving the high-level taster for StellarGraph; the demo notebook seems like a better place to look for copy/pasteable code, and so I think the trade-off of having an implied "write your own code" section is ok. You make a good point that it's a little ambiguous, so I've added a stub function:
import pandas as pd
from sklearn import model_selection
def load_my_data():
# your own code to load data into Pandas DataFrames, e.g. from CSV files or a database
...
nodes, edges, targets = load_my_data()
# Use scikit-learn to compute training and test sets
train_targets, test_targets = model_selection.train_test_split(targets, train_size=0.5)
|
||
# convert the raw data into StellarGraph's graph format for faster operations | ||
graph = sg.StellarGraph(nodes, edges) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in theory it is, but I put this here for 3 reasons:
- three has a reputation as being a good number for writing https://en.wikipedia.org/wiki/Rule_of_three_(writing)
- put all of the StellarGraph stuff together so that we can say that the first and third sections are just normal data science, to emphasise that graph ML is not a new workflow
- I couldn't think of a third reason, but 3 is nice 😄
|
||
**StellarGraph** is a Python library for machine learning on [graphs and networks](https://en.wikipedia.org/wiki/Graph_\(discrete_mathematics\)). | ||
|
||
## Table of Contents |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Algorithms short teaser of all the methods implemented in the library
I put this towards the end because I felt it's mainly helpful for someone familiar with graphs and graph ML. For instance, someone who is looking for a specific algorithm. They might either come here through a google search for that algorithm name or searching for graph ML libraries, and then either ctrl-F for the specific algorithm or jump to the section.
For someone unfamiliar with graph ML (or, at least, unfamiliar with the specific algorithms), I don't think the current table of algorithms is very easily digested, and so I'm concerned readers will get distracted before wading through it all.
Installation the structure right now tells a user how to install with demo before installing the library. I think the basic library installation should be upfront. The demo are the optional bit that if a user is interested in testing the functionality of the library before they can install with the demos.
I put this later because it's a bit of a "reference" section. If a user has decided to install StellarGraph, they can easily jump to the relevant section; if they haven't decided to install it, it's mostly distraction. For the latter, one thing it is useful for might be to emphasise how easy it is to install, so I've added:
It is thus also [easy to install with `pip` or Anaconda](#Installation).
to the end of the last paragraph of the introduction.
Sample demos here you point out to the demos directory.
4.1 Example The detailed example should follow right after the general introduction to the sample demos. (the example looks much more user friendly to me now :-)). I have some other comments that I will add to the example.
To be specific, you think it's better to have a "Sample demos" section than a "Getting started" one? I feel that the "getting started" phrasing is more obvious as a place for users to look: it is talking about what they want to do (start using StellarGraph), rather than the route we're suggesting they use (look at demos).
Getting Help _I think the help is for all the things above, installation, understanding of approaches, demos etc. so should come at the end of everything. _
As touched upon my some of my earlier comments, I don't see this as a linear document because there's some sections that many uses will not look at (because they don't need to, for their particular use-case/background, e.g. if pip install stellargraph
is fine for a user, they don't need to look at the Anaconda or docker installation instructions). In addition, I think we should be emphasising the getting help section: we are not getting too many issues at the moment, so we should be happy to help anyone with any problem related to StellarGraph. It's better for someone to ask a question that has already been answered (but they didn't find) than to not ask and give up using StellarGraph.
supervised classifier training for the downstream task. | ||
- See the demo in folder `demos/node-classification-hinsage` for examples of how to predict attributes of nodes | ||
using the HinSAGE algorithm for given node features and training labels. | ||
## Getting Help |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I replied to this on your comment on the table of contents)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The addition of our email contact to the README looks good. This did get me thinking about CONTRIBUTING.md though - should it also have the contact details (or a link back to the README?), and also the README itself doesn't link to CONTRIBUTING.md.
Thanks for the reviews everyone, I think it's now much better than my first version! 😄 |
This tweaks our top-level README to try to be a bit more focused and helpful:
This doesn't touch the algorithms or references sections, since these are somewhat standalone, are good concepts, and can be tweaked as independent work. This also doesn't touch any other READMEs, such as those for the demos.
There's potentially many more tweaks we could do. Let me know if you think any are good.
Rendered form: https://github.com/stellargraph/stellargraph/blob/feature/725-readme/README.md
See: #725