update to en doc
imbajin committed May 6, 2022
1 parent 8d4f440 commit dec619a5fea5c0c0c8481ce15736171442959dc9
Showing 1 changed file with 39 additions and 44 deletions.
@@ -1,61 +1,56 @@
## Welcome to HugeGraph
## Introduction to HugeGraph

### Summary

HugeGraph is an easy-to-use, efficient, general-purpose open source graph database system ([GitHub project](https://github.com/hugegraph/hugegraph)).
It implements the [Apache TinkerPop3](https://tinkerpop.apache.org) framework and is fully compatible with the [Gremlin](https://tinkerpop.apache.org/gremlin.html) query language.
With a complete set of toolchain components, it helps users easily build applications and products on top of a graph database. HugeGraph supports fast import of more than 10 billion vertices and edges and provides millisecond-level relationship queries (OLTP).
It can also be integrated with big data platforms such as Hadoop and Spark for offline analysis (OLAP).

Typical application scenarios of HugeGraph include deep relationship exploration, association analysis, path search, feature extraction, data clustering, community detection, and knowledge graphs. It is applicable to business fields such as network security, telecommunications fraud, financial risk control, advertising recommendation, social networks, and intelligent robots.

The system's primary application scenario is meeting the graph data storage and modeling analysis needs of the Baidu security business unit in areas such as anti-fraud, threat intelligence, and combating underground (black-market) industries. On this basis, it has been gradually extended to support more general-purpose graph applications.

### Features

HugeGraph supports graph operations in both online and offline environments, batch data import, and efficient analysis of complex relationships, and it integrates seamlessly with big data platforms.
HugeGraph supports parallel operations by multiple users. Users can submit Gremlin queries and get graph query results promptly, or call the HugeGraph API from their own programs for graph analysis or queries.
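
As a rough illustration of submitting a Gremlin query from a user program, the sketch below builds an HTTP request without sending it. The `/gremlin` path, the `127.0.0.1:8080` address, and the JSON field names are assumptions made for illustration and should be checked against the HugeGraph-Server API documentation; `build_gremlin_request` is a hypothetical helper, not part of any HugeGraph client.

```python
import json
from urllib import request


def build_gremlin_request(base_url, gremlin, bindings=None):
    """Build (but do not send) an HTTP POST that carries a Gremlin query.

    The endpoint path and payload layout are assumptions, not confirmed
    by this document.
    """
    payload = {
        "gremlin": gremlin,            # the query text itself
        "bindings": bindings or {},    # named parameters for the query
        "language": "gremlin-groovy",  # assumed script language name
        "aliases": {},
    }
    return request.Request(
        url=base_url.rstrip("/") + "/gremlin",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical local server address; nothing is sent over the network here.
req = build_gremlin_request("http://127.0.0.1:8080", "g.V().limit(3)")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send the query to a running server; keeping construction separate from sending makes the request easy to inspect or test first.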

HugeGraph has the following features:

- Ease of use: HugeGraph supports the Gremlin graph query language and a RESTful API, provides common interfaces for graph retrieval, and comes with full-featured peripheral tools, making it easy to implement graph-based queries and analysis.
- Efficiency: HugeGraph is deeply optimized for graph storage and graph computing. It provides a variety of batch import tools that can quickly import tens of billions of records, and its optimized queries deliver millisecond-level responses for graph retrieval. It supports concurrent online real-time operations by thousands of users.
- Universal: HugeGraph supports the Apache Gremlin standard graph query language and the Property Graph standard modeling method, supports graph-based OLTP and OLAP, and integrates with the Apache Hadoop and Apache Spark big data platforms.
- Scalable: HugeGraph supports distributed storage, multiple data replicas, and horizontal scaling. It has several built-in back-end storage engines, and additional engines can be added as plug-ins.
- Open: The HugeGraph code is open source (Apache 2 License). Users can modify and customize it freely and optionally contribute back to the open source community.

The functions of this system include, but are not limited to:

- Batch import of data from multiple sources (including local files, HDFS files, and MySQL databases) in multiple file formats (including TXT, CSV, and JSON)
- A visual interface for operating on, analyzing, and displaying graphs, which lowers the barrier to entry for users
- Optimized graph interfaces: shortest path, K-step connected subgraph (K-neighbor), vertices reached in K steps (K-out), the PersonalRank personalized recommendation algorithm, etc.
- Built on the Apache TinkerPop3 framework, with support for the Gremlin graph query language
- Property graph support: properties can be added to both vertices and edges, and a rich set of property types is supported
- Independent schema metadata, with powerful graph modeling capabilities that ease integration with third-party systems
- Multiple vertex ID strategies: primary key ID, automatically generated ID, user-defined string ID, and user-defined numeric ID
- Indexes on edge and vertex properties, supporting exact-match queries, range queries, and full-text search
- A pluggable storage system that supports RocksDB, Cassandra, ScyllaDB, HBase, MySQL, PostgreSQL, Palo, InMemory, etc.
- Integration with big data systems such as Hadoop and Spark GraphX, with support for Bulk Load operations
- High availability (HA), multiple data replicas, backup and recovery, monitoring, etc.
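
To make the graph interfaces named in the list above more concrete, here is a minimal in-memory sketch of K-out, K-neighbor, and shortest path using breadth-first search. It is not HugeGraph's implementation: the toy graph, the function names, and the reading of K-out as "vertices exactly K hops away" and K-neighbor as "vertices within K hops" are assumptions made for illustration.

```python
from collections import deque

# Toy directed graph as adjacency lists; the vertex names are made up.
GRAPH = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": ["e"],
    "e": [],
}


def bfs_depths(graph, source, max_depth=None):
    """Return {vertex: hop count from source}, optionally capped at max_depth."""
    depths = {source: 0}
    queue = deque([source])
    while queue:
        vertex = queue.popleft()
        if max_depth is not None and depths[vertex] == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor in graph.get(vertex, []):
            if neighbor not in depths:
                depths[neighbor] = depths[vertex] + 1
                queue.append(neighbor)
    return depths


def k_out(graph, source, k):
    """Vertices whose shortest distance from source is exactly k hops."""
    return sorted(v for v, d in bfs_depths(graph, source, k).items() if d == k)


def k_neighbor(graph, source, k):
    """Vertices reachable from source within k hops (source excluded)."""
    return sorted(v for v, d in bfs_depths(graph, source, k).items() if 0 < d <= k)


def shortest_path(graph, source, target):
    """One shortest path by hop count, or [] if target is unreachable."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        vertex = queue.popleft()
        if vertex == target:
            path = []
            while vertex is not None:  # walk parent links back to the source
                path.append(vertex)
                vertex = parents[vertex]
            return path[::-1]
        for neighbor in graph.get(vertex, []):
            if neighbor not in parents:
                parents[neighbor] = vertex
                queue.append(neighbor)
    return []


print(k_out(GRAPH, "a", 2))             # ['d']
print(k_neighbor(GRAPH, "a", 2))        # ['b', 'c', 'd']
print(shortest_path(GRAPH, "a", "e"))   # ['a', 'b', 'd', 'e']
```

A database would run such traversals against stored, indexed adjacency data rather than a Python dict, but the hop-by-hop expansion is the same idea.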

### Modules

- [HugeGraph-Server](/docs/quickstart/hugegraph-server): HugeGraph-Server is the core part of the HugeGraph project and includes sub-modules such as Core, Backend, and API;
  - Core: The graph engine implementation, which connects to the Backend module below it and supports the API module above it;
  - Backend: Stores graph data in a back-end store. The supported backends are Memory, Cassandra, ScyllaDB, RocksDB, HBase, MySQL, and PostgreSQL; users choose one according to their needs;
  - API: A built-in REST server that provides a RESTful API to users and is fully compatible with Gremlin queries.
- [HugeGraph-Client](/docs/quickstart/hugegraph-client): HugeGraph-Client provides a RESTful API client for connecting to HugeGraph-Server. Currently only a Java version is implemented; users of other languages can implement their own;
- [HugeGraph-Loader](/docs/quickstart/hugegraph-loader): HugeGraph-Loader is a data import tool based on HugeGraph-Client. It converts plain text data into graph vertices and edges and inserts them into the graph database;
- [HugeGraph-Spark](/docs/quickstart/hugegraph-spark): HugeGraph-Spark performs parallel computation on graphs, for example the PageRank algorithm;
- [HugeGraph-Hubble](/docs/quickstart/hugegraph-hubble): HugeGraph-Hubble is HugeGraph's web-based visual management platform, a one-stop visual analysis platform that covers the whole process from data modeling, through rapid data import, to online and offline data analysis and unified graph management;
- [HugeGraph-Tools](/docs/quickstart/hugegraph-tools): HugeGraph-Tools is HugeGraph's deployment and management toolset, with functions such as graph management, backup/restore, and Gremlin execution.
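
The layering described for HugeGraph-Server, with Core sitting between a pluggable Backend and the API, can be sketched roughly as follows. Every name here (`BackendStore`, `register_backend`, `InMemoryStore`, `GraphCore`) is hypothetical and chosen only to illustrate the plug-in idea; none of it reflects HugeGraph's actual classes.

```python
from abc import ABC, abstractmethod


class BackendStore(ABC):
    """Hypothetical storage interface that the core layer depends on."""

    @abstractmethod
    def put(self, key, value):
        """Store a value under a key."""

    @abstractmethod
    def get(self, key):
        """Return the stored value, or None if the key is absent."""


# Registry mapping a backend name to its implementing class (hypothetical).
BACKENDS = {}


def register_backend(name):
    """Class decorator that makes a backend selectable by name."""
    def decorator(cls):
        BACKENDS[name] = cls
        return cls
    return decorator


@register_backend("memory")
class InMemoryStore(BackendStore):
    """Simplest possible backend: a Python dict."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class GraphCore:
    """Hypothetical core layer that only talks to the BackendStore interface."""

    def __init__(self, backend_name):
        self.store = BACKENDS[backend_name]()

    def add_vertex(self, vertex_id, properties):
        self.store.put(("vertex", vertex_id), dict(properties))

    def get_vertex(self, vertex_id):
        return self.store.get(("vertex", vertex_id))


core = GraphCore("memory")
core.add_vertex("marko", {"age": 29})
print(core.get_vertex("marko"))
```

Because `GraphCore` never names a concrete store, adding another backend in this sketch means registering one more `BackendStore` subclass and selecting it by name.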

### Contact Us

- Project initiator: 刘杰
- Project lead: 韩祖利
- Technical lead: [李章梅](https://github.com/javeme)
- Contacts: [张义](https://github.com/zhoney), [李凝瑞](https://github.com/Linary)
- [GitHub Issues](https://github.com/apache/incubator-hugegraph/issues): Feedback on usage issues and feature requests (preferred)
- Feedback email: [hugegraph@googlegroups.com](mailto:hugegraph@googlegroups.com)
- WeChat official account: HugeGraph
