Modified README.md and mkdocs to correct errors in grammer and spelling #161

Merged
merged 1 commit into from Jul 17, 2019
12 changes: 6 additions & 6 deletions README.md
@@ -1,5 +1,5 @@
<h1 align="center">
Kashgari
<a href='https://en.wikipedia.org/wiki/Mahmud_al-Kashgari'>Kashgari</a>
</h1>

<p align="center">
@@ -29,21 +29,21 @@
<a href="https://kashgari.bmio.net/about/contributing/">Contributing</a>
</h4>

🎉🎉🎉 We are proud to announce that we entirely rewrite Kashgari with tf.keras, now Kashgari comes with cleaner API and faster speed. 🎉🎉🎉
🎉🎉🎉 We are proud to announce that we have entirely rewritten Kashgari with tf.keras; Kashgari now comes with an easier-to-understand API and is faster! 🎉🎉🎉

## Overview

Kashgari is simple and powerful NLP Transfer learning framework, build a state-of-art model in 5 minutes for named entity recognition (NER), part-of-speech tagging (PoS) and text classification tasks.
Kashgari is a simple and powerful NLP transfer learning framework that lets you build a state-of-the-art model in 5 minutes for named entity recognition (NER), part-of-speech tagging (PoS), and text classification tasks.

- **Human-friendly**. Kashgari's code is straightforward, well documented and tested, which makes it very easy to understand and modify.
- **Powerful and simple**. Kashgari allows you to apply state-of-the-art natural language processing (NLP) models to your text, such as named entity recognition (NER), part-of-speech tagging (PoS) and classification.
- **Buildin transfer learning**. Kashgari build-in pre-trained BERT and Word2vec embedding models, which makes it very simple to transfer learning to train your model.
- **Built-in transfer learning**. Kashgari has built-in pre-trained BERT and Word2vec embedding models, which makes it very simple to use transfer learning to train your model.
- **Fully scalable**. Kashgari provides a simple, fast, and scalable environment for fast experimentation: train your models and experiment with new approaches using different embeddings and model structures.
- **Product Ready**. Kashgari could export model with `SavedModel` format for tensorflow serving, you could directly deploy it on cloud.
- **Production Ready**. Kashgari can export models in the `SavedModel` format for TensorFlow Serving, so you can directly deploy them in the cloud.

## Our Goal

- **Academic users** Experiments to prove their hypothesis without coding from scratch.
- **Academic users** Easier experimentation to prove their hypotheses without coding from scratch.
- **NLP beginners** Learn how to build an NLP project with production level code quality.
- **NLP developers** Build a production level classification/labeling model within minutes.

4 changes: 2 additions & 2 deletions mkdocs/docs/about/contributing.md
@@ -1,6 +1,6 @@
# Contributing &amp; Support

We are happy to accept your contributions to make `Kashgari` better and more awesome! You could contribute in various ways:
We are happy to accept contributions that make `Kashgari` better and more awesome! You can contribute in various ways:

## Bug Reports

Expand All @@ -25,7 +25,7 @@ Take part in reviewing pull requests and/or reviewing direct commits. Make sugg

## Answer Questions in Issues

Take time and answer questions and offer suggestions to people who've created issues in the issue tracker. Often people will have questions that you might have an answer for. Or maybe you know how to help them accomplish a specific task they are asking about. Feel free to share your experience to help others out.
Take time and answer questions and offer suggestions to people who've created issues in the issue tracker. Often people will have questions that you might have an answer for. Or maybe you know how to help them accomplish a specific task they are asking about. Feel free to share your experience with others to help them out.

## Pull Requests

14 changes: 7 additions & 7 deletions mkdocs/docs/advance-use/handle-numeric-features.md
@@ -4,8 +4,8 @@

https://github.com/BrikerMan/Kashgari/issues/90

Some time, except the text, we have some additional features like text formatting (italic, bold, centered),
position in text and more. Kashgari provides `NumericFeaturesEmbedding` and `StackedEmbedding` for this kine data. Here is the details.
At times, besides the text itself, we have additional features like text formatting (italic, bold, centered),
position in text, and more. Kashgari provides `NumericFeaturesEmbedding` and `StackedEmbedding` for this kind of data. Here are the details:

If you have a dataset like this:

@@ -51,7 +51,7 @@ label_list = [label] * 100

SEQUENCE_LEN = 100

# You can use WordEmbedding or BERTEmbedding for your text embedding
# You can use Word Embedding or BERT Embedding for your text embedding
text_embedding = BareEmbedding(task=kashgari.LABELING, sequence_length=SEQUENCE_LEN)
start_of_p_embedding = NumericFeaturesEmbedding(feature_count=2,
feature_name='start_of_p',
Expand All @@ -65,7 +65,7 @@ center_embedding = NumericFeaturesEmbedding(feature_count=2,
feature_name='center',
sequence_length=SEQUENCE_LEN)

# first one must be the text embedding
# the first embedding in the stack must be the text embedding
stack_embedding = StackedEmbedding([
text_embedding,
start_of_p_embedding,
Expand All @@ -77,11 +77,11 @@ x = (text_list, start_of_p_list, bold_list, center_list)
y = label_list
stack_embedding.analyze_corpus(x, y)

# Now we can embed with this stacked embedding layer
# Now we can embed using this stacked embedding layer
print(stack_embedding.embed(x))
```

Once embedding layer prepared, you could use all of the classification and labeling models.
Once the embedding layer is prepared, you can use all of the classification and labeling models.

```python
# We can build any labeling model with this embedding
Expand All @@ -94,6 +94,6 @@ print(model.predict(x))
print(model.predict_entities(x))
```

This is the struct of this model.
This is the structure of this model.

![](../static/images/multi_feature_model.png)
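The key constraint of the stacked-embedding input above is that every numeric feature sequence must align token-for-token with the text sequence. A minimal plain-Python sketch of that layout (hypothetical toy data, not the corpus from the page):

```python
# Hypothetical toy corpus in the (text, feature_1, feature_2, ...) tuple
# layout that StackedEmbedding consumes: one sequence per sample, and
# every feature sequence aligned token-for-token with the text.
text_list = [["NLP", "Projects", "Project", "Name", ":"]] * 100
start_of_p_list = [[1, 2, 1, 2, 2]] * 100   # 1 = starts a paragraph
bold_list = [[1, 1, 1, 1, 1]] * 100          # 1 = bold token
center_list = [[1, 1, 2, 2, 2]] * 100        # 1 = centered token
label_list = [["B-Category", "I-Category", "B-Project", "I-Project", "O"]] * 100

x = (text_list, start_of_p_list, bold_list, center_list)
y = label_list

# Sanity check: every feature sequence lines up with its text sequence
for feature_seqs in x[1:]:
    for tokens, feats in zip(x[0], feature_seqs):
        assert len(tokens) == len(feats)
```

The same length checks run inside `analyze_corpus`-style preprocessing; doing them up front makes alignment bugs fail fast.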
4 changes: 2 additions & 2 deletions mkdocs/docs/advance-use/multi-output-model.md
@@ -28,7 +28,7 @@ output_2 = [
[0. 0. 1.]]
```

Then you need to create a customized processor inhered from the `ClassificationProcessor`.
Then you need to create a customized processor inherited from the `ClassificationProcessor`.

```python
import kashgari
@@ -57,7 +57,7 @@ class MultiOutputProcessor(ClassificationProcessor):
return tuple(result)
```
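The one-hot output arrays this processor produces can be sketched in plain Python (a minimal stand-in for a `to_categorical`-style helper; the function name is hypothetical):

```python
def one_hot(labels, num_classes):
    """Convert integer class ids to one-hot rows (a minimal stand-in
    for a keras.utils.to_categorical-style helper)."""
    return [[1.0 if i == lab else 0.0 for i in range(num_classes)]
            for lab in labels]

# Two label sets for the same two samples -> two one-hot output arrays,
# matching the output_1 / output_2 shape shown above.
output_1 = one_hot([0, 1], num_classes=2)
output_2 = one_hot([2, 2], num_classes=3)
# output_2 == [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]
```

Each output gets its own class count, which is why a multi-output model needs one such array per target.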

Then build your own model inhered from the `BaseClassificationModel`
Then build your own model inherited from the `BaseClassificationModel`

```python
import kashgari
30 changes: 15 additions & 15 deletions mkdocs/docs/api/tasks.labeling.md
@@ -84,9 +84,9 @@ def build_model(self,

__Args__:

- **x_train**: Array of train feature data (if the model has a single input),
or tuple of train feature data array (if the model has multiple inputs)
- **y_train**: Array of train label data
- **x_train**: Array of training feature data (if the model has a single input),
or tuple of training feature data array (if the model has multiple inputs)
- **y_train**: Array of training label data
- **x_validate**: Array of validation feature data (if the model has a single input),
or tuple of validation feature data array (if the model has multiple inputs)
- **y_validate**: Array of validation label data
@@ -114,9 +114,9 @@ __Args__:
- **cpu_relocation**: A boolean value to identify whether to create the model's weights
under the scope of the CPU. If the model is not defined under any preceding device
scope, you can still rescue it by activating this option.
- **x_train**: Array of train feature data (if the model has a single input),
or tuple of train feature data array (if the model has multiple inputs)
- **y_train**: Array of train label data
- **x_train**: Array of training feature data (if the model has a single input),
or tuple of training feature data array (if the model has multiple inputs)
- **y_train**: Array of training label data
- **x_validate**: Array of validation feature data (if the model has a single input),
or tuple of validation feature data array (if the model has multiple inputs)
- **y_validate**: Array of validation label data
Expand All @@ -137,9 +137,9 @@ __Args__:

- **strategy**: `TPUDistributionStrategy`. The strategy to use for replicating model
across multiple TPU cores.
- **x_train**: Array of train feature data (if the model has a single input),
or tuple of train feature data array (if the model has multiple inputs)
- **y_train**: Array of train label data
- **x_train**: Array of training feature data (if the model has a single input),
or tuple of training feature data array (if the model has multiple inputs)
- **y_train**: Array of training label data
- **x_validate**: Array of validation feature data (if the model has a single input),
or tuple of validation feature data array (if the model has multiple inputs)
- **y_validate**: Array of validation label data
@@ -207,9 +207,9 @@ def fit(self,

__Args__:

- **x_train**: Array of train feature data (if the model has a single input),
or tuple of train feature data array (if the model has multiple inputs)
- **y_train**: Array of train label data
- **x_train**: Array of training feature data (if the model has a single input),
or tuple of training feature data array (if the model has multiple inputs)
- **y_train**: Array of training label data
- **x_validate**: Array of validation feature data (if the model has a single input),
or tuple of validation feature data array (if the model has multiple inputs)
- **y_validate**: Array of validation label data
@@ -241,9 +241,9 @@ def fit_without_generator(self,

__Args__:

- **x_train**: Array of train feature data (if the model has a single input),
or tuple of train feature data array (if the model has multiple inputs)
- **y_train**: Array of train label data
- **x_train**: Array of training feature data (if the model has a single input),
or tuple of training feature data array (if the model has multiple inputs)
- **y_train**: Array of training label data
- **x_validate**: Array of validation feature data (if the model has a single input),
or tuple of validation feature data array (if the model has multiple inputs)
- **y_validate**: Array of validation label data
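The single-input vs. multi-input distinction in these `x_train` docs can be sketched with hypothetical toy data:

```python
# Single-input model: x_train is one array of token sequences.
x_train_single = [["Hello", "world"], ["Good", "morning"]]

# Multi-input model (e.g. text plus a numeric feature): x_train is a
# tuple of such arrays, one entry per model input, all the same length.
x_train_multi = (
    [["Hello", "world"], ["Good", "morning"]],  # text input
    [[1, 1], [1, 2]],                            # numeric feature input
)
y_train = [["O", "O"], ["O", "O"]]

# Every input array must cover the same samples as y_train
assert all(len(arr) == len(y_train) for arr in x_train_multi)
```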
30 changes: 16 additions & 14 deletions mkdocs/docs/index.md
@@ -1,5 +1,5 @@
<h1 align="center" >
<strong style="color: rgba(0,0,0,.87);">Kashgari</strong>
<a href='https://en.wikipedia.org/wiki/Mahmud_al-Kashgari'><strong style="color: rgba(0,0,0,.87);">Kashgari</strong></a>
</h1>

<p align="center">
Expand All @@ -20,34 +20,37 @@
</a>
</p>

🎉🎉🎉 We are proud to announce that we entirely rewrite Kashgari with tf.keras, now Kashgari comes with cleaner API and faster speed. 🎉🎉🎉
🎉🎉🎉 We are proud to announce that we have entirely rewritten Kashgari with tf.keras; Kashgari now comes with an easier-to-understand API and is faster! 🎉🎉🎉

Kashgari is simple and powerful NLP Transfer learning framework, build a state-of-art model in 5 minutes for named entity recognition (NER), part-of-speech tagging (PoS) and text classification tasks.
## Overview

Kashgari is a simple and powerful NLP transfer learning framework that lets you build a state-of-the-art model in 5 minutes for named entity recognition (NER), part-of-speech tagging (PoS), and text classification tasks.

- **Human-friendly**. Kashgari's code is straightforward, well documented and tested, which makes it very easy to understand and modify.
- **Powerful and simple**. Kashgari allows you to apply state-of-the-art natural language processing (NLP) models to your text, such as named entity recognition (NER), part-of-speech tagging (PoS) and classification.
- **Built-in transfer learning**. Kashgari has built-in pre-trained BERT and Word2vec embedding models, which makes it very simple to use transfer learning to train your model.
- **Fully scalable**. Kashgari provides a simple, fast, and scalable environment for fast experimentation: train your models and experiment with new approaches using different embeddings and model structures.
- **Product Ready**. Kashgari could export model with `SavedModel` format for tensorflow serving, you could directly deploy it on cloud.
- **Production Ready**. Kashgari can export models in the `SavedModel` format for TensorFlow Serving, so you can directly deploy them in the cloud.

## Our Goal

- **Academic users** Experiments to prove their hypothesis without coding from scratch.
- **Academic users** Easier experimentation to prove their hypotheses without coding from scratch.
- **NLP beginners** Learn how to build an NLP project with production level code quality.
- **NLP developers** Build a production level classification/labeling model within minutes.

## Performance

| Task | Language | Dataset | Score | Detail |
| ------------------------ | -------- | ------------------------- | -------------- | ---------------------------------------------------------------------------------- |
| Named Entity Recognition | Chinese | People's Daily Ner Corpus | **94.46** (F1) | [Text Labeling Performance Report](./tutorial/text-labeling.md#performance-report) |
| Task | Language | Dataset | Score | Detail |
| ------------------------ | -------- | ------------------------- | -------------- | ------------------------------------------------------------------------------------------------------------------ |
| Named Entity Recognition | Chinese | People's Daily Ner Corpus | **94.46** (F1) | [Text Labeling Performance Report](https://kashgari.bmio.net/tutorial/text-labeling/#performance-report) |

## Tutorials

Here is a set of quick tutorials to get you started with the library:

- [Tutorial 1: Text Classification Model](tutorial/text-classification.md)
- [Tutorial 2: Text Labeling Model](tutorial/text-labeling.md)
- [Tutorial 1: Text Classification](https://kashgari.bmio.net/tutorial/text-classification/)
- [Tutorial 2: Text Labeling](https://kashgari.bmio.net/tutorial/text-labeling/)
- [Tutorial 3: Language Embedding](https://kashgari.bmio.net/embeddings/)

There are also articles and posts that illustrate how to use Kashgari:

Expand All @@ -61,8 +64,7 @@ There are also articles and posts that illustrate how to use Kashgari:

### Requirements and Installation

!!!important
We renamed the tf.keras version as `kashgari-tf`
🎉🎉🎉 We renamed the tf.keras version as **kashgari-tf** 🎉🎉🎉

The project is based on TensorFlow 1.14.0 and Python 3.6+, because it is 2019 and type hints are cool.

Expand All @@ -74,7 +76,7 @@ pip install tensorflow==1.14.0
pip install tensorflow-gpu==1.14.0
```

### Basic Usage
### Example Usage

Let's run an NER labeling model with a Bi-LSTM model.
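The training data for such a labeling model is built as parallel lists of tokens and tags. A hypothetical sketch of that shape (not Kashgari's bundled corpus):

```python
# Hypothetical BIO-tagged training data in the shape Kashgari's
# labeling models consume: parallel lists of token and tag sequences.
train_x = [
    ["I", "live", "in", "Beijing", "."],
    ["Kashgari", "is", "awesome", "."],
]
train_y = [
    ["O", "O", "O", "B-LOC", "O"],
    ["O", "O", "O", "O"],
]

# Each tag sequence must match its token sequence in length
assert all(len(x) == len(y) for x, y in zip(train_x, train_y))
```

This pair is what `model.fit(train_x, train_y)` below consumes.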

@@ -147,7 +149,7 @@ model.fit(train_x, train_y)

## Contributing

Thanks for your interest in contributing! There are many ways to get involved; start with the [contributor guidelines](about/contributing.md) and then check these open issues for specific tasks.
Thanks for your interest in contributing! There are many ways to get involved; start with the [contributor guidelines](https://kashgari.bmio.net/about/contributing/) and then check these open issues for specific tasks.

## Reference

3 changes: 1 addition & 2 deletions mkdocs/docs/tutorial/text-classification.md
@@ -170,8 +170,7 @@ model.fit(x, y)
## Customize your own model

It is very easy and straightforward to build your own customized model,
just inherit the `BaseClassificationModel` and implement the `get_default_hyper_parameters()` function
and `build_model_arc()` function.
just inherit the `BaseClassificationModel` and implement the `get_default_hyper_parameters()` function and `build_model_arc()` function.

```python
from typing import Dict, Any