add docs for using dev container and fix typo #2711

Merged
merged 1 commit into from
May 19, 2023
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -316,7 +316,7 @@ sess.close()

This operation will notify the backend engines and vineyard
to safely unload graphs and their applications,
Then, the coordinator will dealloc all the applied resources in the k8s cluster.
Then, the coordinator will release all the applied resources in the k8s cluster.

Please note that we have not hardened this release for production use and it lacks important security features such as authentication and encryption, and therefore **it is NOT recommended for production use (yet)!**

Expand Down
4 changes: 2 additions & 2 deletions docs/analytical_engine/builtin_algorithms.md
Original file line number Diff line number Diff line change
Expand Up @@ -97,7 +97,7 @@ Compute closeness centrality for vertices.

## Clustering

The clustering algorithm is to compute the clustering coefficient for each vertex of a graph. The clustering coefficient of a vertex in a graph quantifies how close its neighbours are to being a clique (complete graph).
The clustering algorithm is to compute the clustering coefficient for each vertex of a graph. The clustering coefficient of a vertex in a graph quantifies how close its neighbors are to being a clique (complete graph).
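
As a rough illustration of the definition above (a plain-Python sketch, not the GraphScope implementation), the coefficient of a vertex with `k` neighbors is the number of edges among those neighbors divided by `k * (k - 1) / 2`. The helper name `clustering_coefficient` and the adjacency-set input are assumptions made for this example, and the graph is taken to be undirected and simple.

```python
from itertools import combinations


def clustering_coefficient(adj, v):
    """Local clustering coefficient of v in an undirected simple graph.

    adj maps each vertex to the set of its neighbors (hypothetical input format).
    """
    neighbors = adj[v]
    k = len(neighbors)
    if k < 2:
        # Fewer than two neighbors: no pair can form a triangle.
        return 0.0
    # Count the edges that actually exist among the neighbors of v.
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


# Toy graph: a triangle 0-1-2 plus a pendant vertex 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
coefficients = {v: clustering_coefficient(adj, v) for v in adj}
```

Vertices 0 and 1 sit in a complete neighborhood, vertex 2 has only one of its three possible neighbor pairs connected, and vertex 3 has a single neighbor.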

```{py:function} clustering()

Expand Down Expand Up @@ -275,7 +275,7 @@ Compute shortest paths from a source vertex in the graph.

## VoteRank

VoteRank is to measure a ranking of the vertices in a graph based on a voting scheme. With VoteRank, all vertices vote for each of its in-neighbours and the vertices with the top highest votes is elected iteratively.
VoteRank computes a ranking of the vertices in a graph based on a voting scheme. With VoteRank, every vertex votes for each of its in-neighbors, and the vertex with the highest number of votes is elected in each iteration.

```{py:function} voterank(num_of_nodes)

Expand Down
2 changes: 1 addition & 1 deletion docs/analytical_engine/customized_algorithms.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# Customized Algorithms

In addtion to the built-in algorithms, GraphScope also provides a way to run your own algorithms on the analytical engine. The analytical engine is designed to be extensible, hence you can implement your own algorithms in PIE or FLASH model, in Java、C++ or Python, and run them on GraphScope.
In addition to the built-in algorithms, GraphScope also provides a way to run your own algorithms on the analytical engine. The analytical engine is designed to be extensible, hence you can implement your own algorithms in PIE or FLASH model, in Java、C++ or Python, and run them on GraphScope.



Expand Down
3 changes: 1 addition & 2 deletions docs/analytical_engine/dev_and_test.md
Original file line number Diff line number Diff line change
Expand Up @@ -30,13 +30,12 @@ Together with the `grape_engine` are shared libraries, or there may have a bunch
You could install it to a location by

```bash
./gs make analytcial-install --install-prefix /usr/local
./gs make analytical-install --install-prefix /usr/local
```

````{note}
More in-depth view:

These options are repeated items that would be directed forwared as `cmake` options.
The `CMakeLists.txt` of analytical engine is in `analytical_engine/CMakeLists.txt`.

Take a look at this file if you want to investigate more of the analytical engine.
Expand Down
2 changes: 1 addition & 1 deletion docs/analytical_engine/getting_started.md
Original file line number Diff line number Diff line change
Expand Up @@ -58,5 +58,5 @@ As shown in the above example, it is very easy to use GraphScope to analyze a gr
Next, you may want to learn more about the following topics:

- [Design of the analytical engine of GraphScope and its technical details](analytical_engine/design_of_gae)
- [Disaggregate deployment of GraphScope on a k8s cluster for large-scale graph analysis](analytical_engine/deployment)
- [Disaggregated deployment of GraphScope on a k8s cluster for large-scale graph analysis](analytical_engine/deployment)
- [A set of examples with advanced usage, including customized algorithms, NetworkX/Giraph/GraphX compatibility, etc.](analytical_engine/guide_and_exmaples)
2 changes: 1 addition & 1 deletion docs/analytical_engine/guide_and_examples.md
Original file line number Diff line number Diff line change
Expand Up @@ -98,7 +98,7 @@ Write and run algorithms in Python
Write and run algorithms in Java with PIE and Pregel model
````

Better still, if you already have your application running on Giraph or GraphX, the packaged `jar` can directly run on GraphScope. The migration is totally transparent, you even don't need to have the sourcecode!
Better still, if you already have your application running on Giraph or GraphX, the packaged `jar` can directly run on GraphScope. The migration is totally transparent, you even don't need to have the source code!

````{panels}
:header: text-center
Expand Down
4 changes: 2 additions & 2 deletions docs/analytical_engine/overview_and_architecture.md
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,7 @@ GAE also supports incremental computation over graph data via the [Ingress](http

GAE provides C++, Python and Java SDKs for graph applications, where users can freely choose programming models, programming languages, and computation patterns (batch computation or incremental computation) to develop their own applications. GAE of GraphScope also provides [20 graph analytics algorithms](https://graphscope.io/docs/latest/analytical_engine/builtin_algorithms.html) as built-in algorithms, and users can directly invoke them. GraphScope is compatible with NetworkX APIs, and thus diverse kinds of [built-in algorithms in NetworkX](https://networkx.org/documentation/stable/reference/algorithms/index.html) can also be directly invoked by users. In total, over 100 build-in graph analytical algorithms can be directly executed over GraphScope. Additionally, the support for the Pregel model has been implemented in GAE, and graph algorithms implemented in Giraph or GraphX can also be directly run on GAE. Please refer to the following tutorials on how to run NetworkX/Giraph/GraphX applications on GAE.

- [Tutorial: Graph Operations with NetowrkX APIs](https://graphscope.io/docs/latest/analytical_engine/tutorial_networkx_operations.html)
- [Tutorial: Graph Algorithms with NetowrkX APIs](https://graphscope.io/docs/latest/analytical_engine/tutorial_networkx_algorithms.html)
- [Tutorial: Graph Operations with NetworkX APIs](https://graphscope.io/docs/latest/analytical_engine/tutorial_networkx_operations.html)
- [Tutorial: Graph Algorithms with NetworkX APIs](https://graphscope.io/docs/latest/analytical_engine/tutorial_networkx_algorithms.html)
- [Tutorial: Run Giraph Applications on GraphScope](https://graphscope.io/docs/latest/analytical_engine/tutorial_run_giraph_apps.html)
- [Tutorial: Run GraphX Applications on GraphScope](https://graphscope.io/docs/latest/analytical_engine/tutorial_run_graphx_apps.html)
4 changes: 2 additions & 2 deletions docs/analytical_engine/tutorial_dev_algo_cpp_pie.md
Original file line number Diff line number Diff line change
Expand Up @@ -88,7 +88,7 @@ class MyApp : public grape::ParallelAppBase<FRAG_T, MyAppContext<FRAG_T>>,
};
```

The `MyApp` class inherits from the `grape::ParallelAppBase`, which provides the basic functionality for implementing a parallel graph algorithm. It also inherits from the `grape::ParallelEngine` and `grape::Communicator` classes, which provide the communication and parallel processing capabilities. The MyApp class defines two static constexpr variables called `message_strategy` and `load_strategy`, these variables specify the message strategy and load strategy used in the computation. For more information please refer to the [libgrape-lite Doc](https://alibaba.github.io/libgrape-lite).
The `MyApp` class inherits from the `grape::ParallelAppBase`, which provides the basic functionality for implementing a parallel graph algorithm. It also inherits from the `grape::ParallelEngine` and `grape::Communicator` classes, which provide the communication and parallel processing capabilities. The MyApp class defines two `static constexpr` variables called `message_strategy` and `load_strategy`, these variables specify the message strategy and load strategy used in the computation. For more information please refer to the [libgrape-lite Doc](https://alibaba.github.io/libgrape-lite).

The `PEval` method is used to implement the partial evaluation phase of the computation. In current example, we initialize the communication channels and do nothing else, instead, we put the computing logic into `IncEval` method.

Expand Down Expand Up @@ -121,7 +121,7 @@ the codebase structure is as follows:
└── .gs_conf.yaml ➝ configuration file
```

then, we package the algorithm by comand: ` zip -jr 'my_app.gar' '*.h' ''.gs_conf.yaml'`
Then, we package the algorithm with the command: `zip -jr 'my_app.gar' '*.h' '.gs_conf.yaml'`
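
For environments without the `zip` tool, the same packaging step could be sketched in Python with the standard `zipfile` module. This assumes that a `.gar` file is an ordinary flat zip archive, as the command above suggests; the helper name `package_gar` is hypothetical and not part of GraphScope.

```python
import zipfile
from pathlib import Path


def package_gar(src_dir, gar_path):
    """Pack the *.h files and .gs_conf.yaml of src_dir into a flat zip archive."""
    src = Path(src_dir)
    members = sorted(src.glob("*.h")) + [src / ".gs_conf.yaml"]
    with zipfile.ZipFile(gar_path, "w", zipfile.ZIP_DEFLATED) as gar:
        for path in members:
            # arcname drops the directory part, mirroring `zip -j`.
            gar.write(path, arcname=path.name)
    return [path.name for path in members]


# Example call from inside the algorithm directory:
# package_gar(".", "my_app.gar")
```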

## Step 4: Run the .gar file on GraphScope

Expand Down
6 changes: 3 additions & 3 deletions docs/analytical_engine/tutorial_dev_algo_java.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ In this tutorial, you will first try to explore GraphScope JavaSDK with some exa

## Run example algorithms with example jar

An example jar which constains implementation of several graph algorithms(i.e. PageRank, SSSP, BFS) is provided in
An example jar which contains implementation of several graph algorithms(i.e. PageRank, SSSP, BFS) is provided in
[grape-demo.jar](https://graphscope.oss-cn-beijing.aliyuncs.com/jar/grape-demo-0.19.0-shaded.jar). You can run the graph algorithms provided in this jar by submitting the downloaded jar to GraphScope.

Here we provide an example to run `SSSP` on p2p dataset.
Expand Down Expand Up @@ -69,7 +69,7 @@ Different from the *pregel* interface provided by **Apache Giraph** and **Spark
models graph computing in a ``subgraph-centric`` manner.
In `PIE` model, the program requires less supersteps and the size of generated message has been drastically reduced, which lead to great performance improvement.

To implement a `PIE` algorithm, you need to provide two separate functions, `PEval` and `IncEval`. `PEval` function will be execute only once at the first round of computation, and `IncEval` will be called for multiple times untile covergence. You are also supposed to provide a class called `Context`. You can put intermediate
To implement a `PIE` algorithm, you need to provide two separate functions, `PEval` and `IncEval`. The `PEval` function will be executed only once at the first round of computation, and `IncEval` will be called multiple times until convergence. You are also supposed to provide a class called `Context`. You can put intermediate
results, init configuration in this class. The `init` method will be called before
`PEval`.
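
The control flow described above can be sketched as a single-process toy in Python. This is not the GraphScope Java SDK API: the names `peval`, `inc_eval`, `propagate` and `run_pie` are hypothetical, each fragment is just a dict from owned vertex to neighbor list, and a per-fragment `labels` dict stands in for the `Context`. The example algorithm, connected components by minimum-label propagation, is chosen only for illustration.

```python
def propagate(frag, labels, active):
    """Push smaller labels along edges; collect messages for non-local vertices."""
    outgoing = {}
    stack = list(active)
    while stack:
        u = stack.pop()
        for w in frag[u]:
            if w in frag:
                # Local neighbor: update in place and keep propagating.
                if labels[u] < labels[w]:
                    labels[w] = labels[u]
                    stack.append(w)
            elif labels[u] < outgoing.get(w, float("inf")):
                # Remote neighbor: keep only the smallest label per target.
                outgoing[w] = labels[u]
    return outgoing


def peval(frag, labels):
    """Runs once: initialise the context and solve the local subgraph."""
    for v in frag:
        labels[v] = v
    return propagate(frag, labels, set(frag))


def inc_eval(frag, labels, messages):
    """Runs every later round: apply incoming labels, then propagate changes."""
    changed = set()
    for v, label in messages:
        if label < labels[v]:
            labels[v] = label
            changed.add(v)
    return propagate(frag, labels, changed)


def run_pie(fragments):
    """Drive PEval once, then IncEval until no fragment sends a message."""
    labels = [dict() for _ in fragments]
    owner = {v: i for i, frag in enumerate(fragments) for v in frag}
    outboxes = [peval(frag, labels[i]) for i, frag in enumerate(fragments)]
    while any(outboxes):
        inboxes = [[] for _ in fragments]
        for outbox in outboxes:
            for v, label in outbox.items():
                inboxes[owner[v]].append((v, label))
        outboxes = [inc_eval(frag, labels[i], inboxes[i])
                    for i, frag in enumerate(fragments)]
    merged = {}
    for part in labels:
        merged.update(part)
    return merged


# Path 0-1-2-3 and edge 4-5, split across two fragments.
frag0 = {0: [1], 1: [0, 2], 4: [5]}
frag1 = {2: [1, 3], 3: [2], 5: [4]}
components = run_pie([frag0, frag1])
```

Because each fragment resolves its local subgraph before sending anything, only labels crossing a fragment boundary become messages, which is the intuition behind the reduced superstep and message counts mentioned above.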

Expand Down Expand Up @@ -163,7 +163,7 @@ import graphscope
from graphscope import JavaApp
from graphscope.dataset import load_p2p_network

"""Or lauch session in k8s cluster"""
"""Or launch session in k8s cluster"""
sess = graphscope.session(cluster_type='hosts')


Expand Down
6 changes: 3 additions & 3 deletions docs/analytical_engine/tutorial_networkx_operations.md
Original file line number Diff line number Diff line change
Expand Up @@ -75,7 +75,7 @@ or by adding any ebunch of edges. An ebunch is any iterable container of edge-tu

```python
G.add_edges_from([(2, 3, {"weight": 3.1415})])
list(G.edges.data()) # shows the edge arrtibutes
list(G.edges.data()) # shows the edge attributes
G.add_edges_from(H.edges)
list(G.edges)
```
Expand Down Expand Up @@ -313,7 +313,7 @@ list(DG.predecessors(1))
Some algorithms work only for directed graphs and others are not well defined for directed graphs. Indeed the tendency to lump directed and undirected graphs together is dangerous. If you want to treat a directed graph as undirected for some measurement you should probably convert it using `Graph.to_undirected()`

```python
H = DG.to_undirected() # return a "deepcopy" of undirected represetation of DG.
H = DG.to_undirected() # return a "deepcopy" of undirected representation of DG.
list(H.edges)

# or with
Expand All @@ -324,7 +324,7 @@ list(H.edges)
Directed graph also supports to reverse edge using `DiGraph.reverse()`.

```python
K = DG.reverse() # retrun a "deepcopy" of reversed copy.
K = DG.reverse() # return a "deepcopy" of reversed copy.
list(K.edges)

# or with
Expand Down
2 changes: 1 addition & 1 deletion docs/analytical_engine/tutorial_run_giraph_apps.md
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@ import graphscope
import os
from graphscope.framework.app import load_app

"""Or lauch session in k8s cluster"""
"""Or launch session in k8s cluster"""
sess = graphscope.session(cluster_type='hosts')

sess.add_lib("/home/graphscope/grape-demo-0.19.0-shaded.jar")
Expand Down
16 changes: 8 additions & 8 deletions docs/analytical_engine/tutorial_run_graphx_apps.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
[Apache Spark](https://spark.apache.org/) is a famous engine for large-scale data analytics. [Spark GraphX](https://spark.apache.org/graphx/) is Spark's graph
computing module, which provides flexible and efficient graph computation framework.

Graphscope is also developed to be integrated with Spark GraphX. User can easily deploy a graphscope cluster colocated with spark cluster. And by switch `SparkSession` to `GSSparkSession`, user can experience up to 7 times performance
Graphscope is also developed to be integrated with Spark GraphX. User can easily deploy a graphscope cluster co-located with spark cluster. And by switch `SparkSession` to `GSSparkSession`, user can experience up to 7 times performance
improvement when running GraphX algorithms.

## Deploy GraphScope along with Spark
Expand All @@ -12,9 +12,9 @@ We assume you already have a spark cluster deployed. If you don't have a spark c
Spark distributions with **version ==3.1.3** has been tested to be compatible with GraphScope.

Also, GraphScope can be easily distributed with python package. Since GraphScope only
support python3, you shall upgrade your python enviroment before proceeding on.
support python3, you shall upgrade your python environment before proceeding on.

Then, on client side, we will use `venv` to create a virtual enviroment pack which contains graphscope package.
Then, on client side, we will use `venv` to create a virtual environment pack which contains graphscope package.

```bash
pip3 install virtualenv venv-pack
Expand All @@ -24,7 +24,7 @@ pip3 install graphscope
venv-pack -o pyspark_venv_gs.tar.gz
```

Now, `pyspark_venv_gs.tar.gz` contains neccessary enviroments graphscope need. Every time
Now, `pyspark_venv_gs.tar.gz` contains necessary environments graphscope need. Every time
you submit a job to your spark cluster, remember to upload this pack.

```bash
Expand Down Expand Up @@ -65,7 +65,7 @@ export GS_JARS=`ls ${GRAPHSCOPE_HOME}/lib/grape-graphx-*.jar`:`ls ${GRAPHSCOPE_H
/home/graphscope/grape-demo-0.19.0-shaded.jar /home/graphscope/p2p-31.e 2 1
```

Remember to replace the placeholders like `${master_url}` with acutal cluster url.
Remember to replace the placeholders like `${master_url}` with actual cluster url.

## Run customized GraphX apps

Expand Down Expand Up @@ -124,11 +124,11 @@ And you also need to configure `maven-shaded-plugin` with following configuratio


Other than the interface provided by GraphX, GraphScope also provide some other graphscope-only features
via `GSSparkSession`. User shall use `GSSparkSession` insteadof `SparkSession` to make their algorithm runnable on GraphScope.
via `GSSparkSession`. User shall use `GSSparkSession` instead of `SparkSession` to make their algorithm runnable on GraphScope.

`GSSparkSession` extends `SparkSession` with following new methods.
```scala
/** GraphgScope related param, setting vineyard memroy size.
/** GraphScope related param, setting vineyard memory size.
*/
def vineyardMemory(memoryStr: String): Builder =
config("spark.gs.vineyard.memory", memoryStr)
Expand Down Expand Up @@ -164,4 +164,4 @@ def loadGraphToGS[VD: ClassTag, ED: ClassTag](

### Run customized GraphX algorithms on Spark with GraphScope support

Great performace improvement is observed when running graphx algorithms on GraphScope other than GraphX. To enable GraphScope support, just add necessary arguments to spark-submit shell when submit your job, like [Submit example GraphX app to Spark](#submit-to-spark). Just remember to to change jar name, app name and params.
A great performance improvement is observed when running GraphX algorithms on GraphScope rather than on GraphX. To enable GraphScope support, just add the necessary arguments to the spark-submit command when you submit your job, as in [Submit example GraphX app to Spark](#submit-to-spark). Just remember to change the jar name, app name and params.
10 changes: 5 additions & 5 deletions docs/deployment/deploy_graphscope_on_self_managed_k8s.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# Deploy GraphScope on K8s cluster

To processing large-scale graph distributedly, GraphScope is designed to be deployed on a Kubernetes(K8s) cluster.
To process large-scale graphs in a distributed environment, GraphScope is designed to be deployed on a Kubernetes (K8s) cluster.

As shown in the figure, you could deploy and manage the workloads of GraphScope through a python client, which communicates with the
GraphScope engines on the K8s cluster through a gRPC service.
Expand Down Expand Up @@ -131,7 +131,7 @@ import graphscope

sess = graphscope.session()
```
As default, it will look for a kubeconfig file in `~/.kube/config`, the file generated by minikube in the previous step will be used.
As default, it will look for a `kubeconfig` file in `~/.kube/config`, the file generated by minikube in the previous step will be used.

As shown above, a session can easily launch a cluster on k8s.

Expand Down Expand Up @@ -169,10 +169,10 @@ sess = graphscope.session(
)
```

#### Provide a kubeconfig file other than default
#### Provide a `kubeconfig` file other than default

If you want to deploy on a pre-existing cluster with a kubeconfig file located in a non-default location,
they can manually specify the path to the kubeconfig file as follows:
If you want to deploy on a pre-existing cluster with a `kubeconfig` file located in a non-default location,
you can manually specify the path to the `kubeconfig` file as follows:

```python
sess = graphscope.session(k8s_client_config='/path/to/config')
Expand Down
8 changes: 4 additions & 4 deletions docs/deployment/deploy_graphscope_with_helm.md
Original file line number Diff line number Diff line change
Expand Up @@ -78,7 +78,7 @@ g = sess.g()
interactive = graphscope.gremlin(g)
```

The param `addr` is an endpoint for connecting a pre-launched service. The `<ip>` and `<port>` is the connection informations you get from previous step.
The param `addr` is an endpoint for connecting to a pre-launched service. The `<ip>` and `<port>` are the connection information you get from the previous step.

Note that only one session can be connected to the service at the same time, but you can reconnect the same service after session close.

Expand All @@ -92,8 +92,8 @@ To remove the resources, use `helm uninstall`. See next section for details.
````

```python
# sess1 = graphscope.session(addr='<ip>:<port>')
sess1.close()
# sess = graphscope.session(addr='<ip>:<port>')
sess.close()
sess2 = graphscope.session(addr='<ip>:<port>')
```

Expand Down Expand Up @@ -123,7 +123,7 @@ And you could see more details in the [homepage](https://artifacthub.io/packages


## Offline Installation
While it's convenient to install graphscope by quering the remote repository, users may want to use it in some environment that doesn't have internet access.
While it's convenient to install graphscope by querying the remote repository, users may want to use it in some environment that doesn't have internet access.
Or user may want to make some customization of the charts before installation.

To cater these needs, We provide two ways to install graphscope by helm without internet access.
Expand Down