Rewrite the top-level README to be more focused #1081

huonw · 2020-03-15T23:04:56Z

This tweaks our top-level README to try to be a bit more focused and helpful:

- rewriting some of the text to be more relevant to data scientists, e.g. reading a bit more naturally by being slightly less precise/exact, and trying to draw the connection to their real data and problems
- providing a concrete workflow to get started, in two forms:
  - a Getting Started section: this includes basic Installation instructions in addition to the detailed ones (users familiar with their environment and Python tooling can start with one of the basic ones, while the expanded section provides more guidance for those who need it)
  - a code example (GCN) that tries to emphasise how much of the workflow just builds on the normal ML/TF workflow, and finishes with links to the full GCN demo as well as the other examples
- providing "quick links" for getting help, as well as making sure that section can be found easily with a search (it mentions "help", "support", "contact us")
- removing/moving some of the content that's more internal/developer focused

This doesn't touch the algorithms or references sections, since these are somewhat standalone, are good concepts, and can be tweaked as independent work. This also doesn't touch any other READMEs, such as those for the demos.

There are potentially many more tweaks we could make. Let me know if you think any are good.

Rendered form: https://github.com/stellargraph/stellargraph/blob/feature/725-readme/README.md

See: #725

codeclimate · 2020-03-15T23:10:03Z

Code Climate has analyzed commit 905b17d and detected 0 issues on this pull request.

View more on Code Climate.

kjun9

Nice! I like the new way it's organised, with a better focus on getting started and getting help, and removing guiding principles from the top.

Some of my suggestions are based around trying to make the minimal example appear more minimal

huonw

Thanks @kjun9, I much prefer the new more minimal one.

huonw · 2020-03-16T05:08:58Z

I've updated this with a few changes:

- added @kjun9's suggestions
- tweaked the comments/exact code from those suggestions very slightly
- expanded the "StellarGraph supports analysis of ..." sentence into a list, covering more than just the homogeneous vs. heterogeneous split
- moved the "The StellarGraph library offers state-of-the-art algorithms ..." section to be first in the Introduction, since it's the important bit, while the discussion of what graph data is serves more as clarification (I could easily be convinced this isn't an improvement, and some other approach is better 😄)
- added more links, including to some of our blog posts

kjun9

Looks great 👍

timpitman · 2020-03-16T23:52:15Z

README.md

@@ -38,90 +38,132 @@
 </p>


-# Table of Contents
+**StellarGraph** is a Python library for machine learning on [graphs and networks](https://en.wikipedia.org/wiki/Graph_\(discrete_mathematics\)).


This is a great concise summary, but the line of text gets a bit lost between the badges and the contents.

huonw

I'll try moving the badges above the `# StellarGraph Machine Learning Library` title, and see if that works better.

huonw · 2020-03-23T04:52:11Z

Oh, I forgot about this. @timpitman what do you think?

timpitman

I think it looks good, thanks!

timpitman

Great refactor of the readme, well done! I've found a few minor issues. There's future work in expanding "getting started" that we're already looking at for #992.

timpitman

excellent!

kieranricardo

looks good!

habiba-h · 2020-03-23T01:52:59Z

README.md


+Graph-structured data represent entities as nodes (or vertices) and relationships between them as edges (or links), along with associated data as attributes. For example, a graph can contain people as nodes and friendships between them as links, with data like a person's age and the date a friendship was established. StellarGraph supports analysis of many kinds of graphs:


> data as attributes

associated with nodes and/or edges. (It didn't sound complete to me without saying whose attributes we are talking about.)

huonw · 2020-03-23T04:55:42Z

I don't know how to phrase this without being a bit awkward. I went with

> and can include data associated with either as attributes

but I don't particularly like it. What do you think? Do you have a suggestion?
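To make the wording question concrete, here is a minimal sketch of the people-and-friendships example from the README sentence, with attributes attached to both nodes and edges. It uses only the Python standard library; `Node`, `Edge` and `describe` are hypothetical names chosen for illustration and are not part of StellarGraph's API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: str
    attributes: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    attributes: dict = field(default_factory=dict)


# People are nodes; each carries its own attributes (here, an age).
people = [
    Node("alice", {"age": 34}),
    Node("bob", {"age": 27}),
]

# Friendships are edges; each carries its own attributes (here, a start date).
friendships = [
    Edge("alice", "bob", {"established": "2015-06-01"}),
]


def describe(nodes, edges):
    """Return one line per edge, combining node and edge attributes."""
    ages = {n.id: n.attributes["age"] for n in nodes}
    return [
        f"{e.source} ({ages[e.source]}) -- {e.target} ({ages[e.target]}), "
        f"friends since {e.attributes['established']}"
        for e in edges
    ]


for line in describe(people, friendships):
    print(line)
```

The point of the sketch is only that "attributes" can belong to either kind of element, which is what the revised sentence is trying to say.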

habiba-h · 2020-03-23T03:13:34Z

README.md


-## Guiding Principles
+- homogeneous (with nodes and links of one type),
+- heterogeneous (with more than one type of nodes and/or links)


Maybe a comma after each bullet, an ", and" on the second-to-last one, and a full stop after the last.

huonw · 2020-03-23T04:59:31Z

I'm in two minds about this style. It does make it read as a sentence, but the hanging "and" and the inconsistent punctuation can be disconcerting.

For instance, https://developers.google.com/tech-writing/one/lists-and-tables doesn't use this style, and the discussion of "keep items parallel" somewhat argues against it.

habiba-h

I don't have a confident opinion about the style, but having only one bullet point with a comma at the end bothered me :-).
I usually use commas, with ", and" on the second-to-last item, when the bullets don't start as standalone sentences. I don't know when I started doing that, but I've held on to that convention. I am down with any popular style guide :-).

habiba-h · 2020-03-23T03:21:23Z

README.md

-  supervised classifier training for the downstream task.
-  - See the demo in folder `demos/node-classification-hinsage` for examples of how to predict attributes of nodes
-  using the HinSAGE algorithm for given node features and training labels.
+## Getting Help


I think the Getting Help section would look more appropriate at the end, before the references.
Usually a Help tab or link is towards the end. Also, the way I see it, Getting Help is for everything: code, installation, and all else. So once everything has been covered in the README, if a user still needs help, here are all the ways they can get it.

huonw · 2020-03-23T06:24:53Z

(I replied to this on your comment on the table of contents)

habiba-h · 2020-03-23T03:36:23Z

README.md

+
+**StellarGraph** is a Python library for machine learning on [graphs and networks](https://en.wikipedia.org/wiki/Graph_\(discrete_mathematics\)).
+
+## Table of Contents


@huonw I have a bit of a different take on the structure and this is only meant as a suggestion, so take whatever makes sense in the end.

1. Introduction: the overarching explanation of what graph ML is and what StellarGraph does
2. Algorithms: a short teaser of all the methods implemented in the library
3. Installation: the structure right now tells a user how to install with the demos before installing the library. I think the basic library installation should be upfront. The demos are the optional bit: if a user is interested in testing the functionality of the library first, they can install with the demos.
4. Sample demos: here you point to the demos directory.
   4.1 Example: the detailed example should follow right after the general introduction to the sample demos (the example looks much more user friendly to me now :-)). I have some other comments that I will add to the example.
5. Getting Help: _I think the help is for all the things above (installation, understanding of approaches, demos, etc.), so it should come at the end of everything._
6. Citing
7. References

huonw · 2020-03-23T06:19:07Z

> Algorithms: a short teaser of all the methods implemented in the library

I put this towards the end because I felt it's mainly helpful for someone familiar with graphs and graph ML, for instance, someone who is looking for a specific algorithm. They might come here either through a Google search for that algorithm name or by searching for graph ML libraries, and then either ctrl-F for the specific algorithm or jump to the section.

For someone unfamiliar with graph ML (or, at least, unfamiliar with the specific algorithms), I don't think the current table of algorithms is very easily digested, and so I'm concerned readers will get distracted before wading through it all.

> Installation: the structure right now tells a user how to install with the demos before installing the library. I think the basic library installation should be upfront. The demos are the optional bit: if a user is interested in testing the functionality of the library first, they can install with the demos.

I put this later because it's a bit of a "reference" section. If a user has decided to install StellarGraph, they can easily jump to the relevant section; if they haven't decided to install it, it's mostly a distraction. For the latter, one thing it might be useful for is emphasising how easy it is to install, so I've added:

> It is thus also [easy to install with `pip` or Anaconda](#Installation).

to the end of the last paragraph of the introduction.

> Sample demos: here you point to the demos directory.
> 4.1 Example: the detailed example should follow right after the general introduction to the sample demos (the example looks much more user friendly to me now :-)). I have some other comments that I will add to the example.

To be specific, you think it's better to have a "Sample demos" section than a "Getting started" one? I feel that the "getting started" phrasing is more obvious as a place for users to look: it is talking about what they want to do (start using StellarGraph), rather than the route we're suggesting they use (look at demos).

> Getting Help: _I think the help is for all the things above (installation, understanding of approaches, demos, etc.), so it should come at the end of everything._

As touched upon in some of my earlier comments, I don't see this as a linear document, because there are some sections that many users will not look at (because they don't need to for their particular use-case/background, e.g. if `pip install stellargraph` is fine for a user, they don't need to look at the Anaconda or Docker installation instructions). In addition, I think we should be emphasising the Getting Help section: we are not getting too many issues at the moment, so we should be happy to help anyone with any problem related to StellarGraph. It's better for someone to ask a question that has already been answered (but that they didn't find) than to not ask and give up using StellarGraph.

habiba-h · 2020-03-23T03:57:46Z

README.md


+# convert the raw data into StellarGraph's graph format for faster operations
+graph = sg.StellarGraph(nodes, edges)
+


I think this part should be split. Creating the StellarGraph is a separate task from the generator and other model-specific tasks.

huonw · 2020-03-23T05:53:38Z

I think in theory it is, but I put this here for 3 reasons:

1. three has a reputation as being a good number for writing https://en.wikipedia.org/wiki/Rule_of_three_(writing)
2. put all of the StellarGraph stuff together so that we can say that the first and third sections are just normal data science, to emphasise that graph ML is not a new workflow
3. I couldn't think of a third reason, but 3 is nice 😄

habiba-h

> three has a reputation as being a good number for writing https://en.wikipedia.org/wiki/Rule_of_three_(writing)

I didn't know about the rule of three. Learnt something new :-D.

huonw

Thanks for the thoughtful review @habiba-h, I've made some updates based on some, and replied with some questions for the others.

huonw · 2020-03-23T05:43:47Z

README.md

+from sklearn import model_selection
+
+# load data into Pandas DataFrames, e.g. from CSV files or a database
+nodes, edges, targets = load_my_data()


No, there's no load_my_data function in StellarGraph. We previously spelled this out in more detail, e.g. https://github.com/stellargraph/stellargraph/blob/1aa1188fbc0ea5ae7f65f1861dc5e8f31f44f3e9/README.md#example-gcn shows the pre-review version of the README, but we cut it out because it allowed reducing the amount of code significantly (see #1081 (comment)).

As before, this example is focusing on giving the high-level taster for StellarGraph; the demo notebook seems like a better place to look for copy/pasteable code, and so I think the trade-off of having an implied "write your own code" section is ok. You make a good point that it's a little ambiguous, so I've added a stub function:

    import pandas as pd
    from sklearn import model_selection

    def load_my_data():
        # your own code to load data into Pandas DataFrames, e.g. from CSV files or a database
        ...

    nodes, edges, targets = load_my_data()

    # Use scikit-learn to compute training and test sets
    train_targets, test_targets = model_selection.train_test_split(targets, train_size=0.5)
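For readers wondering what the "write your own code" stub might contain, the following is one possible sketch, not the README's code. It assumes the data lives in two CSV sources: a node table indexed by an `id` column with a `label` column to predict, and an edge table with `source` and `target` columns. The CSV contents, column names and in-memory `io.StringIO` stand-ins for files are all assumptions made for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for real files or database queries.
NODES_CSV = """id,age,label
alice,34,staff
bob,27,customer
carol,45,customer
"""

EDGES_CSV = """source,target,since
alice,bob,2015
bob,carol,2018
"""


def load_my_data():
    # Node features, indexed by node ID.
    nodes = pd.read_csv(io.StringIO(NODES_CSV), index_col="id")
    # The label column is the prediction target, so split it off from the features.
    targets = nodes.pop("label")
    # One row per edge, naming the two nodes it connects.
    edges = pd.read_csv(io.StringIO(EDGES_CSV))
    return nodes, edges, targets


nodes, edges, targets = load_my_data()
print(nodes.shape, edges.shape, targets.to_dict())
```

With real data, the `io.StringIO(...)` arguments would be replaced by file paths or the result of a database query; the rest of the example in the README stays the same.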

timpitman

The addition of our email contact to the README looks good. This did get me thinking about CONTRIBUTING.md though - should it also have the contact details (or a link back to the README?), and also the README itself doesn't link to CONTRIBUTING.md.

huonw · 2020-03-24T07:38:10Z

Thanks for the reviews everyone, I think it's now much better than my first version! 😄

huonw added 2 commits March 13, 2020 17:35

Move CI section

2a0212d

Rewrite the readme to be more focused

1aa1188

huonw added 4 commits March 16, 2020 10:13

Tweaks

1297037

Tweak and move example

17fb553

fix ToC

290bb6b

Add help and support keywords

b25efa2

huonw marked this pull request as ready for review March 16, 2020 01:26

huonw requested review from kieranricardo, timpitman and PantelisElinas March 16, 2020 01:26

kjun9 reviewed Mar 16, 2020

PantelisElinas approved these changes Mar 16, 2020

Apply suggestions from code review

2732106

Co-Authored-By: kevin <33508488+kjun9@users.noreply.github.com>

huonw commented Mar 16, 2020

huonw added 5 commits March 16, 2020 15:55

Remove .

82b9270

Minor tweaks from code review

c1ba068

Expand "supports analysis of" section to a list

4ae6abe

Move introduction

f54b7d5

Add more links

95b4cf8

huonw requested a review from kjun9 March 16, 2020 05:09

kjun9 approved these changes Mar 16, 2020

timpitman reviewed Mar 16, 2020

timpitman reviewed Mar 16, 2020

timpitman suggested changes Mar 17, 2020

tweaks from Tim's review

7aa1d86

huonw requested a review from timpitman March 17, 2020 03:37

timpitman approved these changes Mar 17, 2020

huonw requested a review from habiba-h March 18, 2020 03:18

kieranricardo approved these changes Mar 18, 2020

habiba-h reviewed Mar 18, 2020

huonw added 3 commits March 18, 2020 16:56

Add link to hateful twitter blog post

e3af6c8

Add graph classification

08ac2c9

Rewrite example to use more normal text

deb809d

huonw requested a review from habiba-h March 19, 2020 04:26

kjun9 mentioned this pull request Mar 19, 2020

Replace quickstart.txt with readme #1096

Merged

habiba-h suggested changes Mar 23, 2020

Updates from Habiba's review

7acaae8

huonw commented Mar 23, 2020

huonw added 4 commits March 23, 2020 17:27

Move tensorflow import

55cd247

Add email

c12459d

add subject

6ef23ea

Move model = line in the example

c451a57

huonw requested review from habiba-h and timpitman March 23, 2020 23:59

timpitman approved these changes Mar 24, 2020

Fix graph link

ea02cc4

habiba-h approved these changes Mar 24, 2020

Merge remote-tracking branch 'origin/develop' into feature/725-readme

905b17d

huonw merged commit 9ef9a1f into develop Mar 24, 2020

huonw deleted the feature/725-readme branch March 31, 2020 03:50
