This repository has been archived by the owner on Feb 21, 2024. It is now read-only.

Update "Getting Started" documentation #2028

Merged: 19 commits, Jul 3, 2019

Contributor

@asvetlik asvetlik commented Jun 28, 2019

Overview

This will redo the current Getting Started on the Pilosa website. It is adding Go, Java, and Python and removing Docker.

Pull request checklist

  • I have read the contributing guide.
  • I have agreed to the Contributor License Agreement.
  • I have updated the documentation.
  • I have resolved any merge conflicts.
  • I have included tests that cover my changes.
  • All new and existing tests pass.
  • Make sure PR title conforms to convention in CHANGELOG.md.
  • Add appropriate changelog label to PR (if applicable).

Code review checklist

This is the checklist that the reviewer will follow while reviewing your pull request. You do not need to do anything with this checklist, but be aware of what the reviewer will be looking for.

  • Ensure that any changes to external docs have been included in this pull request.
  • If the changes require that minor/major versions need to be updated, tag the PR appropriately.
  • Ensure the new code is properly commented and follows Idiomatic Go.
  • Check that tests have been written and that they cover the new functionality.
  • Run tests and ensure they pass.
  • Build and run the code, performing any applicable integration testing.
  • Make sure PR title conforms to convention in CHANGELOG.md.
  • Make sure PR is tagged with appropriate changelog label.

@asvetlik asvetlik requested a review from jaffee June 28, 2019 16:48
@asvetlik
Contributor Author

Under the "Using Java" section and under the "Creating the Environment" subsection, you have to make some edits to the pom.xml file. I want to bold or change the color of the specific changes that need to be made, but I don't know how. Also, should I just change the pom.xml file that is in the official getting-started repository and remove the whole need to edit in the first place?

@asvetlik asvetlik closed this Jun 28, 2019
@asvetlik asvetlik reopened this Jun 28, 2019
@asvetlik
Contributor Author

I realized why I didn't just edit the pom.xml file and make a pull request. I can do a pull request and update the version, but the StarTrace.py file in the Getting Started is different than the startrace.py file in the getting-started repository and the way the mainClass is called is different for each.

"What's Next?",
]
+++

## Getting Started

Pilosa supports an HTTP interface which uses JSON by default.
Any HTTP tool can be used to interact with the Pilosa server. The examples in this documentation will use [curl](https://curl.haxx.se/) which is available by default on many UNIX-like systems including Linux and MacOS. Windows users can download curl [here](https://curl.haxx.se/download.html).
Any HTTP tool can be used to interact with the Pilosa server. The examples in this documentation will use curl which is available by default on many UNIX-like systems including Linux and MacOS. However, the best way to interface with the Pilosa server is through one of our three client libraries. Pilosa currently supports [Go](https://github.com/pilosa/go-pilosa), [Java](https://github.com/pilosa/java-pilosa), and [Python](https://github.com/pilosa/python-pilosa).
Contributor

"one of our three official client libraries" - we do have others, but we focus on these three because they're full-featured, and we keep them up to date.


#### Create the Schema
Pilosa supports curl (or any HTTP tool), Go, Java, and Python. In this project, we will walk you through how to use each one to best communicate with the Pilosa server.
Contributor

I'm going to continue with some pedantic comments for a while: The wording is a bit off here - Pilosa the organization supports the development of the go-pilosa, java-pilosa and python-pilosa client libraries, but Pilosa the server supports any client that can send requests to it.

@@ -56,14 +55,20 @@ curl localhost:10101/schema
{"indexes":null}
Contributor

We should update this to show a non-null response - maybe a response after importing is complete, and a note explaining that.
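
For reference, a minimal sketch of how a client script might read the `/schema` response quoted above, including the `{"indexes":null}` case. The function name `index_names` is hypothetical, and the shape of a non-null response (a list of objects with a `name` key) is an assumption rather than something shown in this thread.

```python
import json


def index_names(schema_body: str) -> list:
    """Return the index names found in a /schema response body."""
    schema = json.loads(schema_body)
    # A server with no indexes returns {"indexes": null}; treat null as empty.
    indexes = schema.get("indexes") or []
    # Assumption: each index entry is an object carrying a "name" key.
    return [index["name"] for index in indexes]


# The null response quoted in the docs yields an empty list.
print(index_names('{"indexes":null}'))
# A hypothetical non-null response, as might appear after creating an index.
print(index_names('{"indexes":[{"name":"repository"}]}'))
```

Treating `null` as an empty list keeps the calling code free of special cases before any index has been created.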


##### Create the Environment

In order to communicate with Pilosa through your Go code, you must have a "translator," which is go-pilosa. To install go-pilosa, open a terminal (one other than the one running Pilosa) and download the library in your `GOPATH` using:
Contributor

We don't normally use the term "translator" for this, and it does have another meaning in the context of Pilosa - "client" would be a better term.


To contain the Getting Started project in one place, we will create a new folder as follows:
```
mkdir GettingStarted && cd GettingStarted
Contributor

@alanbernstein alanbernstein Jun 28, 2019

I'd suggest getting-started for consistency with the repo. (also because I just prefer lower case filenames)

curl -O https://raw.githubusercontent.com/pilosa/getting-started/master/language.csv
```

We will also create a file called `StarTrace.go` as follows:
Contributor

I'd suggest startrace.go for consistency, plus go naming conventions.

```
stargazer = repository.field("stargazer", time_quantum=pilosa.TimeQuantum.YEAR_MONTH_DAY)
```
Since our data contains time stamps which represent the time users starred repos, we establish the time aspect by using `time_quantum`. Time quantum is the resolution of the time we want to use, and we set it to `YEAR_MONTH-DAY` for `stargazer`.
Contributor

YEAR_MONTH-DAY looks like a typo
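
To illustrate what "resolution" means for a time quantum such as `YEAR_MONTH_DAY`, here is a small sketch that derives one bucket per unit from a timestamp. It only demonstrates the idea; the helper `quantum_buckets`, the letter codes, and the bucket format are hypothetical and are not taken from Pilosa's implementation.

```python
from datetime import datetime

# Hypothetical mapping from quantum letters to strftime patterns.
_UNIT_FORMATS = {"Y": "%Y", "M": "%Y%m", "D": "%Y%m%d", "H": "%Y%m%d%H"}


def quantum_buckets(timestamp: datetime, quantum: str) -> list:
    """Return one bucket label per unit in the quantum, coarsest first."""
    return [timestamp.strftime(_UNIT_FORMATS[unit]) for unit in quantum]


# A star event on 18 May 2017 falls into a year, a month, and a day bucket.
starred_at = datetime(2017, 5, 18, 20, 40)
print(quantum_buckets(starred_at, "YMD"))  # ['2017', '201705', '20170518']
```

A coarser quantum such as `"Y"` would produce a single bucket, so queries could only distinguish events by year.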

@@ -232,7 +224,682 @@ curl localhost:10101/index/repository/query \
Please note that while user ID 99999 may not be sequential with the other column IDs, it is still a relatively low number.
Don't try to use arbitrary 64-bit integers as column or row IDs in Pilosa - this will lead to problems such as poor performance and out of memory errors.

#### Using Go

Pilosa requires Go 1.12 or higher. It is also recommended that you have a code editor downloaded.
Contributor

This might be personal preference, but I think we can leave out the "code editor" recommendation.


##### Create the Environment

In order to communicate with Pilosa through your Go code, you must have a "translator," which is go-pilosa. To install go-pilosa, open a terminal (one other than the one running Pilosa) and download the library in your `GOPATH` using:
Contributor

"download the library to your GOPATH"


##### Create the Schema

Before we can import data or run queries, we need to create our schema. Go-pilosa is implemented by importing `github.com/pilosa/go-pilosa` and its ability to read csv files is implemented by importing `github.com/pilosa/go-pilosa/csv`. The first steps to creating the schema are creating a client which will communicate our schema to Pilosa, creating a schema which will contain our indexes and fields, and syncing with Pilosa. This is all done in the `StarTrace.go` file:
Contributor

"implemented" is not quite the right term to use here. I'd suggest something like

"You can see two imports from the go-pilosa repo, go-pilosa for the client, and csv for the CSV reader."


##### Create the Schema

Before we can import data or run queries, we need to create our schema. Go-pilosa is implemented by importing `github.com/pilosa/go-pilosa` and its ability to read csv files is implemented by importing `github.com/pilosa/go-pilosa/csv`. The first steps to creating the schema are creating a client which will communicate our schema to Pilosa, creating a schema which will contain our indexes and fields, and syncing with Pilosa. This is all done in the `StarTrace.go` file:
Contributor

@alanbernstein alanbernstein Jun 28, 2019

I think those three steps are all of the steps in creating the schema - I'd phrase this as something like "Create the schema on the Pilosa server by first creating a client, defining the schema locally, then syncing with Pilosa".

log.Fatal(err)
}
```
Since our `stargazer` data contains time stamps, which represent the time users starred repos, we will be using the `csv.NewColumnIteratorWithTimeStampFormat` function that is built into the go-pilosa import. This function takes the format of the csv files (`csv.RowIDColumnID`), an `io.Reader` (`bytes.NewReader(stargazerFile)`), and the time quantum format (`format`) and translates the csv file into a format Pilosa can read. Time quantum is the resolution of the time we want to use.
Contributor

@alanbernstein alanbernstein Jun 28, 2019

instead of "that is built into the go-pilosa import", we can say "from the go-pilosa/csv package"


##### Create the Environment

In order to communicate with Pilosa through your Go code, you must have a client, which is go-pilosa. To install go-pilosa, open a terminal (one other than the one running Pilosa) and download the library to your `GOPATH` using:
Contributor

Interacting with Pilosa in your Go program is best accomplished using our client, go-pilosa. Install go-pilosa (in a new terminal) and download ...

go get github.com/pilosa/go-pilosa
```

To contain the Getting Started project in one place, we will create a new folder as follows:
Contributor

Create a project folder:


We will now create the java directory that will contain our `startrace.java` file and create the `startrace.java` file:
```
mkdir src && cd src
Contributor

mkdir -p src/main/java && cd src/main/java
touch startrace.java

@@ -227,18 +266,18 @@ Don't try to use arbitrary 64-bit integers as column or row IDs in Pilosa - this

#### Using Go

Pilosa requires Go 1.12 or higher.
Pilosa supports the two most recent versions of Go.
Contributor

"Pilosa follows the Go policy of supporting the two most recent major versions of Go."

FYI, as explained here: https://golang.org/doc/devel/release.html. The "major" is meaningful here.

Contributor Author

Ohhhhhhh

import com.pilosa.client.TimeQuantum;
```
Create the schema by creating a client which will communicate our schema to Pilosa, creating a schema which will contain our indexes and fields, and syncing with Pilosa. This is all done in the `startrace.java` file:
Before we can import data or run queries, we need to create our schema. The first 6 dependencies are imported from the java-pilosa library. Create the schema by creating a client which will communicate our schema to Pilosa, creating a schema which will contain our indexes and fields, and syncing with Pilosa. This is all done in the `StarTrace.java` file:
Contributor

"6" should be "six". This is one of those silly style rules that I don't really believe in, but follow compulsively.

This paragraph can also be changed to match the update in the corresponding part of the Go section (line 297)

Contributor Author

I changed it to be "You can see the first six dependencies are imported from the java-pilosa library." I also made the same change to the Python section

@@ -44,17 +44,48 @@ In order to better understand Pilosa's capabilities, we will create a sample pro

Although Pilosa doesn't keep the data in a tabular format, we still use the terms "columns" and "rows" when describing the data model. We put the primary objects in columns, and the properties of those objects in rows. For example, the Star Trace project will contain an index called "repository" which contains columns representing Github repositories, and rows representing properties like programming languages and stargazers. We can better organize the rows by grouping them into sets called Fields. So the "repository" index might have a "languages" field as well as a "stargazers" field. You can learn more about indexes and fields in the [Data Model](../data-model/) section of the documentation.

Pilosa as an organization supports curl (or any HTTP tool), Go, Java, and Python. However, Pilosa as a server will support any client that can send requests to it. In this project, we will walk you through how to use each one to best communicate with the Pilosa server.
Pilosa officially supports curl (or any HTTP tool), Go, Java, and Python, however it will accept any client that can send requests to it. In this project, we will walk you through how to use each one to best communicate with the Pilosa server.
Contributor

@alanbernstein alanbernstein Jul 2, 2019

The wording is still a bit off here - Pilosa doesn't really know anything about curl specifically.

"Pilosa officially supports three client libraries, for Go, Java and Python. You can also use any HTTP client, such as curl, for quick testing, but official client libraries are the preferred method in production code."

I'd say after making this change, the note below ("Note: This is not the recommended way to interact with Pilosa, but it is the fastest way to see the efficiency of Pilosa.") is no longer needed.

Contributor Author

What was that wording book you were referencing again? I may need to read it 😅

"What's Next?",
]
+++

## Getting Started

Pilosa supports an HTTP interface which uses JSON by default.
Any HTTP tool can be used to interact with the Pilosa server. The examples in this documentation will use [curl](https://curl.haxx.se/) which is available by default on many UNIX-like systems including Linux and MacOS. Windows users can download curl [here](https://curl.haxx.se/download.html).
Any HTTP tool can be used to interact with the Pilosa server. The examples in this documentation will use curl which is available by default on many UNIX-like systems including Linux and MacOS. However, the best way to interface with the Pilosa server is through one of our three official client libraries. Pilosa currently supports [Go](https://github.com/pilosa/go-pilosa), [Java](https://github.com/pilosa/java-pilosa), and [Python](https://github.com/pilosa/python-pilosa).
Contributor

I just realized this paragraph, and the one at line 47, are kind of redundant. Is there a reason for repeating this information?

Contributor Author

This is a paragraph from the original Getting Started file that I edited to fit what we were doing. I don't think we actually need it. I can remove it and then take the Open File Limit comment out of the flag.

Contributor

We should remove one of them, but it seems like a good summary intro for the whole page, so I kind of like it at the beginning.

I don't think the note about the open file limit needs to be changed.

```
Note: This is the response you should receive once completing this project. It has also been formatted using `jq`.
Contributor

We have a "Note" section both before and after the code sample above. Let's combine these into one section, and put it above the code sample. We can also give it special formatting - see line 20 for an example.

If at any time you want to verify the data structure, you can request the schema as follows:
<div class="note">
<p>If at any time you want to verify the data structure, you can request the schema as follows:<\p>
<\div>
Contributor

@alanbernstein alanbernstein Jul 3, 2019

This should be a forward slash, not a backslash (also the <\p> tag). Backslashes are used mostly as "escape characters", it is usually good to avoid them unless you know they are required for something.

```
<div class="note">
<p>Note: This is the response you should receive once completing this project. It has also been formatted using `jq`. <\p>
Contributor

@alanbernstein alanbernstein Jul 3, 2019

the backticks (around jq) don't work inside the div note section. <code>jq</code> would work. It might also be nice to have that be a link to the program site (https://stedolan.github.io/jq/)
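
As a rough illustration of the point above, Markdown backtick spans that end up inside a raw HTML block could be converted to `<code>` tags before publishing. This is only a sketch; the helper name `backticks_to_code_tags` is hypothetical and is not part of the documentation tooling.

```python
import html
import re

# Matches a single-backtick span such as `jq` (no nested backticks).
_BACKTICK_SPAN = re.compile(r"`([^`]+)`")


def backticks_to_code_tags(text: str) -> str:
    """Rewrite `span` as <code>span</code>, escaping HTML inside the span."""
    return _BACKTICK_SPAN.sub(
        lambda match: "<code>" + html.escape(match.group(1)) + "</code>", text
    )


print(backticks_to_code_tags("It has also been formatted using `jq`."))
# It has also been formatted using <code>jq</code>.
```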

``` request
curl localhost:10101/index/repository -X POST
```
``` response
{"success":true}
```
The index name must be 64 characters or less, start with a letter, and consist only of lowercase alphanumeric characters or `_-`.
The index name must be 64 characters or less, start with a letter, and consist only of lowercase alphanumeric characters or `_-`. The same goes for field names.
Contributor

This is a grammar rule that I can never remember properly, but I think the "less" here should be "fewer". If you can verify that, let's change this and other occurrences.

Contributor Author

You are correct. Dictionary.com says that fewer should be used for countable things while less should be used for "singular mass nouns." For example, fewer ingredients and less salt.
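
The naming rule quoted in this thread (64 characters at most, starting with a letter, then only lowercase alphanumerics, `_`, or `-`) can be checked on the client side before sending a request. The sketch below encodes the rule as written in the docs; the regular expression and the name `is_valid_name` are assumptions for illustration, not Pilosa's own validation code.

```python
import re

# Assumption: this pattern mirrors the documented rule, i.e. one lowercase
# letter followed by up to 63 lowercase letters, digits, "_" or "-".
_NAME_PATTERN = re.compile(r"[a-z][a-z0-9_-]{0,63}")


def is_valid_name(name: str) -> bool:
    """Return True if the name satisfies the documented index/field rule."""
    return _NAME_PATTERN.fullmatch(name) is not None


print(is_valid_name("repository"))  # True
print(is_valid_name("Repository"))  # False: uppercase letter
print(is_valid_name("1repo"))       # False: does not start with a letter
print(is_valid_name("a" * 65))      # False: longer than 64 characters
```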

.build();
Field stargazer = repository.field("stargazer", stargazerOptions);
```
Since our data contains time stamps which represent the time users starred repos, we set the field type to `time` using `fieldTime()`. Time quantum is the resolution of the time we want to use, and we set it to `YEAR_MONTH-DAY` for `stargazer`.
Contributor

YEAR_MONTH-DAY should be YEAR_MONTH_DAY

python3 -m venv startrace
```

Next, we activate the python environment we created and install the requirements (and python-pilosa):
Contributor

no need for the "and" in "and python-pilosa". the requirements file contains only a single dependency, which is the python package called "pilosa", implemented in the repository named "python-pilosa".

Contributor Author

So should it be:
Next we activate the python environment we created and install the requirements for python-pilosa.
?

Contributor

"Next, we activate the python environment we created and install the single dependency, python-pilosa"

or

"Next, we activate the python environment we created and install python-pilosa"

The requirements are for the current project, getting-started. The requirements are a list of dependencies, in this case including only one item.

Contributor

@alanbernstein alanbernstein left a comment

Looks good!

@asvetlik asvetlik merged commit c2cbadd into FeatureBaseDB:master Jul 3, 2019
@codysoyland codysoyland changed the title from `Getting Started Update` to `Update "Getting Started" documentation` on Sep 17, 2019
3 participants