fixed conflict
amaiya committed Jan 29, 2022
2 parents a871779 + ef43d36 commit 3de019a
Showing 20 changed files with 502 additions and 119 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -97,6 +97,9 @@ venv.bak/
# Rope project settings
.ropeproject

# VSCode project settings
.vscode

# mkdocs documentation
/site

16 changes: 16 additions & 0 deletions CHANGELOG.md
@@ -6,6 +6,22 @@ Most recent releases are shown at the top. Each release shows:
- **Changed**: Additional parameters, changes to inputs or outputs, etc
- **Fixed**: Bug fixes that don't change documented behaviour

## 0.29.0 (2022-01-28)

### new:
- New vision models: added MobileNetV3-Small and EfficientNet. Thanks to @ilos-vigil.

### changed:
- `core.Learner.plot` now supports plotting of any value that exists in the training `History` object (e.g., `mae` if previously specified as metric). Thanks to @ilos-vigil.
- added `raw_confidence` parameter to `QA.ask` method to return raw confidence scores. Thanks to @ilos-vigil.

### fixed:
- pin to `transformers==4.10.3` due to Issue #398
- pin to `syntok==1.3.3` due to bug with `syntok==1.4.1` causing paragraph tokenization in `qa` module to break
- properly suppress TF/CUDA warnings by default
- ensure each document fed to the `keras_bert` tokenizer is a string (numeric values are converted with `str`) to avoid [this issue](https://stackoverflow.com/questions/67360987/bert-model-bug-encountered-during-training/67375675#67375675)


## 0.28.3 (2021-11-05)

### new:
37 changes: 37 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,37 @@
# Contributing to ktrain

We are happy to accept your contributions to make `ktrain` better! To avoid unnecessary work, please stick to the following process:

1. Check if there is already [an issue](https://github.com/amaiya/ktrain/issues) for your concern.
2. If there is not, open a new one to start a discussion. We hate to close finished PRs.
3. We would be happy to accept a pull request, if it is decided that your concern requires a code change.


## Developing locally

We suggest cloning the repository and then checking out tutorials and examples for information on how to call various methods.
Most relevant classes and methods should be documented. If not, you might consider helping to improve the docstrings.

### Setup

See the [installation instructions](https://github.com/amaiya/ktrain#installation) for setting things up. Using a virtual environment (such as [venv](https://docs.python.org/3/library/venv.html) or [Poetry](https://python-poetry.org/)) is strongly recommended.

### Tests

To run all tests, execute:
```bash
cd ktrain/tests
python3 -m unittest
```

To run a specific test (e.g., `test_dataloading.py`), execute:
```bash
python3 test_dataloading.py
```
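
Test files run this way are ordinary `unittest` modules. The following is a minimal sketch of such a module, not a file from the ktrain test suite: `normalize_label`, `TestNormalizeLabel`, and `run_tests` are hypothetical names used only for illustration.

```python
import unittest


def normalize_label(label):
    """Hypothetical function under test: trim and lowercase a class label."""
    return label.strip().lower()


class TestNormalizeLabel(unittest.TestCase):
    def test_strips_and_lowercases(self):
        self.assertEqual(normalize_label("  Positive "), "positive")

    def test_plain_label_is_unchanged(self):
        self.assertEqual(normalize_label("neg"), "neg")


def run_tests():
    """Run only this module's test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestNormalizeLabel)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Ending such a file with `if __name__ == "__main__": unittest.main()` is what allows it to be started directly with `python3 <file>`; `python3 -m unittest` discovers the test case either way.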

## PR Guidelines

- Keep each PR focused. While it's more convenient, please try to avoid combining several unrelated fixes together.
- Check out the `develop` branch before making any changes, and make sure you choose `develop` as the base branch of your PR.
- Try to maintain backwards compatibility. If this is not possible, please discuss with maintainer(s).
- Use four spaces for indentation.
15 changes: 13 additions & 2 deletions README.md
@@ -10,6 +10,8 @@


### News and Announcements
- **2022-01-28**
- **ktrain v0.29.x** is released and includes miscellaneous enhancements contributed by [Sandy Khosasi](https://github.com/ilos-vigil) such as [support for MobileNetV3 and EfficientNet](https://colab.research.google.com/drive/1EJHpMVG6fBCg33UPla_Ly_6LQdswU2Ur?usp=sharing), [plotting improvements](https://colab.research.google.com/drive/1_WaRQ0J4g0VTn6HWS3kszdFZbBBWoa7R?usp=sharing), and [raw confidence scores in QA](https://colab.research.google.com/drive/1ParprLN9hFX6cxJ1w7bv91PYx4o0J1zm?usp=sharing).
- **2021-10-13**
- **ktrain v0.28.x** is released and now includes the `AnswerExtractor`, which allows you to extract any information of interest from documents by simply phrasing it in the form of a question. A short example is shown here, but see the [example notebook](https://nbviewer.jupyter.org/github/amaiya/ktrain/blob/master/examples/text/qa_information_extraction.ipynb) for more information.
```python
@@ -101,7 +103,7 @@ Please see the following tutorial notebooks for a guide on how to use **ktrain**
* Tutorial A4: [Using Custom Data Formats and Models: Text Regression with Extra Regressors](https://nbviewer.jupyter.org/github/amaiya/ktrain/blob/master/tutorials/tutorial-A4-customdata-text_regression_with_extra_regressors.ipynb)


Some blog tutorials about **ktrain** are shown below:
Some blog tutorials and other guides about **ktrain** are shown below:

> [**ktrain: A Lightweight Wrapper for Keras to Help Train Neural Networks**](https://towardsdatascience.com/ktrain-a-lightweight-wrapper-for-keras-to-help-train-neural-networks-82851ba889c)
@@ -114,6 +116,8 @@ Some blog tutorials about **ktrain** are shown below:
> [**Finetuning BERT using ktrain for Disaster Tweets Classification**](https://medium.com/analytics-vidhya/finetuning-bert-using-ktrain-for-disaster-tweets-classification-18f64a50910b) by Hamiz Ahmed
> [**Indonesian NLP Examples with ktrain**](https://github.com/ilos-vigil/ktrain-assessment-study) by Sandy Khosasi



@@ -323,9 +327,12 @@ Using **ktrain** on **Google Colab**? See these Colab examples:

3. Install *ktrain*: `pip install ktrain`


The above should be all you need on Linux systems and cloud computing environments like Google Colab and AWS EC2. If you are using **ktrain** on a **Windows computer**, you can follow these
[more detailed instructions](https://github.com/amaiya/ktrain/blob/master/FAQ.md#how-do-i-install-ktrain-on-a-windows-machine) that include some extra steps.

**ktrain** should currently support any version of TensorFlow at or above v2.3: i.e., `pip install "tensorflow>=2.3"` (the quotes keep the shell from treating `>` as an output redirect).

**Some important things to note about installation:**

Some optional, extra libraries used for some operations can be installed as needed:
@@ -349,7 +356,11 @@ pip install datasets
```
Notice that **ktrain** is using forked versions of the `eli5` and `stellargraph` libraries above in order to support TensorFlow2.


<!--
pip install pdoc3==0.9.2
pdoc3 --html -o docs ktrain
diff -qr docs/ktrain/ /path/to/repo/ktrain/docs
-->

### How to Cite

81 changes: 74 additions & 7 deletions docs/core.html
@@ -686,14 +686,18 @@ <h1 class="title">Module <code>ktrain.core</code></h1>
```
plots training history
Args:
plot_type (str): one of {&#39;loss&#39;, &#39;lr&#39;, &#39;momentum&#39;}
plot_type (str): A valid value in tf.keras History. Either a built-in value {&#39;loss&#39;, &#39;lr&#39;, &#39;momentum&#39;} or
other values previously specified by user. For instance, if &#39;mae&#39; and/or &#39;mse&#39; is previously specified as metrics
when creating model, then these values can also be specified.
return_fig(bool): If True, return matplotlib.figure.Figure
Return:
matplotlib.figure.Figure if return_fig else None
```
&#34;&#34;&#34;
if self.history is None:
raise Exception(&#39;No training history - did you train the model yet?&#39;)
if not isinstance(plot_type, str):
raise ValueError(&#39;plot_type must be str/string&#39;)

fig = None
if plot_type == &#39;loss&#39;:
@@ -722,7 +726,22 @@ <h1 class="title">Module <code>ktrain.core</code></h1>
plt.ylabel(&#39;momentum&#39;)
plt.xlabel(&#39;iterations&#39;)
else:
raise ValueError(&#39;invalid type: choose loss, lr, or momentum&#39;)
if plot_type not in self.history.history:
raise ValueError(f&#39;no {plot_type} in history: are you sure {plot_type} exists in history?&#39;)
plt.plot(self.history.history[plot_type])

val_key = f&#39;val_{plot_type}&#39;
if val_key in self.history.history:
plt.plot(self.history.history[val_key])
legend_items = [&#39;train&#39;, &#39;validation&#39;]
else:
warnings.warn(f&#39;Validation value for {plot_type} wasn\&#39;t found in history&#39;)
legend_items = [&#39;train&#39;]

plt.title(f&#39;History of {plot_type}&#39;)
plt.ylabel(plot_type)
plt.xlabel(&#39;epoch&#39;)
plt.legend(legend_items, loc=&#39;upper left&#39;)
fig = plt.gcf()
plt.show()
if return_fig: return fig
@@ -1580,8 +1599,12 @@ <h1 class="title">Module <code>ktrain.core</code></h1>
preproc.datagen.preprocessing_function = pre_resnet50
elif preproc_name == &#39;mobilenet&#39;:
preproc.datagen.preprocessing_function = pre_mobilenet
elif preproc_name == &#39;mobilenetv3&#39;:
preproc.datagen.preprocessing_function = pre_mobilenetv3small
elif preproc_name == &#39;inception&#39;:
preproc.datagen.preprocessing_function = pre_inception
elif preproc_name == &#39;efficientnet&#39;:
preproc.datagen.preprocessing_function = pre_efficientnet
else:
raise Exception(&#39;Unknown preprocessing_function name: %s&#39; % (preproc_name))

@@ -1835,8 +1858,12 @@ <h2 class="section-title" id="header-functions">Functions</h2>
preproc.datagen.preprocessing_function = pre_resnet50
elif preproc_name == &#39;mobilenet&#39;:
preproc.datagen.preprocessing_function = pre_mobilenet
elif preproc_name == &#39;mobilenetv3&#39;:
preproc.datagen.preprocessing_function = pre_mobilenetv3small
elif preproc_name == &#39;inception&#39;:
preproc.datagen.preprocessing_function = pre_inception
elif preproc_name == &#39;efficientnet&#39;:
preproc.datagen.preprocessing_function = pre_efficientnet
else:
raise Exception(&#39;Unknown preprocessing_function name: %s&#39; % (preproc_name))

@@ -3492,14 +3519,18 @@ <h3>Inherited members</h3>
```
plots training history
Args:
plot_type (str): one of {&#39;loss&#39;, &#39;lr&#39;, &#39;momentum&#39;}
plot_type (str): A valid value in tf.keras History. Either a built-in value {&#39;loss&#39;, &#39;lr&#39;, &#39;momentum&#39;} or
other values previously specified by user. For instance, if &#39;mae&#39; and/or &#39;mse&#39; is previously specified as metrics
when creating model, then these values can also be specified.
return_fig(bool): If True, return matplotlib.figure.Figure
Return:
matplotlib.figure.Figure if return_fig else None
```
&#34;&#34;&#34;
if self.history is None:
raise Exception(&#39;No training history - did you train the model yet?&#39;)
if not isinstance(plot_type, str):
raise ValueError(&#39;plot_type must be str/string&#39;)

fig = None
if plot_type == &#39;loss&#39;:
@@ -3528,7 +3559,22 @@ <h3>Inherited members</h3>
plt.ylabel(&#39;momentum&#39;)
plt.xlabel(&#39;iterations&#39;)
else:
raise ValueError(&#39;invalid type: choose loss, lr, or momentum&#39;)
if plot_type not in self.history.history:
raise ValueError(f&#39;no {plot_type} in history: are you sure {plot_type} exists in history?&#39;)
plt.plot(self.history.history[plot_type])

val_key = f&#39;val_{plot_type}&#39;
if val_key in self.history.history:
plt.plot(self.history.history[val_key])
legend_items = [&#39;train&#39;, &#39;validation&#39;]
else:
warnings.warn(f&#39;Validation value for {plot_type} wasn\&#39;t found in history&#39;)
legend_items = [&#39;train&#39;]

plt.title(f&#39;History of {plot_type}&#39;)
plt.ylabel(plot_type)
plt.xlabel(&#39;epoch&#39;)
plt.legend(legend_items, loc=&#39;upper left&#39;)
fig = plt.gcf()
plt.show()
if return_fig: return fig
@@ -4712,7 +4758,9 @@ <h3>Methods</h3>
<dd>
<div class="desc"><pre><code>plots training history
Args:
plot_type (str): one of {'loss', 'lr', 'momentum'}
plot_type (str): A valid value in tf.keras History. Either a built-in value {'loss', 'lr', 'momentum'} or
other values previously specified by user. For instance, if 'mae' and/or 'mse' is previously specified as metrics
when creating model, then these values can also be specified.
return_fig(bool): If True, return matplotlib.figure.Figure
Return:
matplotlib.figure.Figure if return_fig else None
@@ -4726,14 +4774,18 @@ <h3>Methods</h3>
```
plots training history
Args:
plot_type (str): one of {&#39;loss&#39;, &#39;lr&#39;, &#39;momentum&#39;}
plot_type (str): A valid value in tf.keras History. Either a built-in value {&#39;loss&#39;, &#39;lr&#39;, &#39;momentum&#39;} or
other values previously specified by user. For instance, if &#39;mae&#39; and/or &#39;mse&#39; is previously specified as metrics
when creating model, then these values can also be specified.
return_fig(bool): If True, return matplotlib.figure.Figure
Return:
matplotlib.figure.Figure if return_fig else None
```
&#34;&#34;&#34;
if self.history is None:
raise Exception(&#39;No training history - did you train the model yet?&#39;)
if not isinstance(plot_type, str):
raise ValueError(&#39;plot_type must be str/string&#39;)

fig = None
if plot_type == &#39;loss&#39;:
@@ -4762,7 +4814,22 @@ <h3>Methods</h3>
plt.ylabel(&#39;momentum&#39;)
plt.xlabel(&#39;iterations&#39;)
else:
raise ValueError(&#39;invalid type: choose loss, lr, or momentum&#39;)
if plot_type not in self.history.history:
raise ValueError(f&#39;no {plot_type} in history: are you sure {plot_type} exists in history?&#39;)
plt.plot(self.history.history[plot_type])

val_key = f&#39;val_{plot_type}&#39;
if val_key in self.history.history:
plt.plot(self.history.history[val_key])
legend_items = [&#39;train&#39;, &#39;validation&#39;]
else:
warnings.warn(f&#39;Validation value for {plot_type} wasn\&#39;t found in history&#39;)
legend_items = [&#39;train&#39;]

plt.title(f&#39;History of {plot_type}&#39;)
plt.ylabel(plot_type)
plt.xlabel(&#39;epoch&#39;)
plt.legend(legend_items, loc=&#39;upper left&#39;)
fig = plt.gcf()
plt.show()
if return_fig: return fig
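
The new `else` branch in `plot` looks up an arbitrary metric in the training history and adds the matching `val_` series when one exists. The selection logic can be sketched without matplotlib as follows. This is an illustration only: `series_for_plot` is a hypothetical helper, not part of ktrain, and `history` is assumed to be a plain dict shaped like `tf.keras.callbacks.History.history`.

```python
import warnings


def series_for_plot(history, plot_type):
    """Return (series, legend_items) for a metric in a History-style dict.

    `history` is assumed to map metric names to per-epoch lists.
    """
    if not isinstance(plot_type, str):
        raise ValueError("plot_type must be str/string")
    if plot_type not in history:
        raise ValueError(f"no {plot_type} in history")

    series = [history[plot_type]]
    legend_items = ["train"]

    # Keras stores validation metrics under the same name with a "val_" prefix.
    val_key = f"val_{plot_type}"
    if val_key in history:
        series.append(history[val_key])
        legend_items.append("validation")
    else:
        warnings.warn(f"Validation value for {plot_type} wasn't found in history")
    return series, legend_items


history = {"loss": [0.9, 0.5], "mae": [0.4, 0.3], "val_mae": [0.5, 0.35]}
series, legend = series_for_plot(history, "mae")
print(legend)  # ['train', 'validation']
```

A metric without a validation counterpart (here `loss`) yields a single `train` series plus a warning, mirroring the behaviour of the diff above.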
18 changes: 17 additions & 1 deletion docs/imports.html
@@ -39,6 +39,9 @@ <h1 class="title">Module <code>ktrain.imports</code></h1>
os.environ[&#39;NUMEXPR_MAX_THREADS&#39;] = &#39;8&#39; # suppress warning from NumExpr on machines with many CPUs

# TensorFlow
SUPPRESS_DEP_WARNINGS = strtobool(os.environ.get(&#39;SUPPRESS_DEP_WARNINGS&#39;, &#39;1&#39;))
if SUPPRESS_DEP_WARNINGS: # 2021-11-12: copied this here to properly suppress TF/CUDA warnings in Kaggle notebooks, etc.
os.environ[&#34;TF_CPP_MIN_LOG_LEVEL&#34;] = &#34;3&#34;
DISABLE_V2_BEHAVIOR = strtobool(os.environ.get(&#39;DISABLE_V2_BEHAVIOR&#39;, &#39;0&#39;))
if DISABLE_V2_BEHAVIOR:
# TF2-transition
@@ -142,9 +145,23 @@ <h1 class="title">Module <code>ktrain.imports</code></h1>
ResNet50 = keras.applications.ResNet50
MobileNet = keras.applications.mobilenet.MobileNet
InceptionV3 = keras.applications.inception_v3.InceptionV3
EfficientNetB1 = keras.applications.efficientnet.EfficientNetB1
EfficientNetB7 = keras.applications.efficientnet.EfficientNetB7
pre_resnet50 = keras.applications.resnet50.preprocess_input
pre_mobilenet = keras.applications.mobilenet.preprocess_input
pre_inception = keras.applications.inception_v3.preprocess_input
pre_efficientnet = keras.applications.efficientnet.preprocess_input

# for TF backwards compatibility (e.g., support for TF 2.3.x):
try:
MobileNetV3Small = keras.applications.MobileNetV3Small
pre_mobilenetv3small = keras.applications.mobilenet_v3.preprocess_input
HAS_MOBILENETV3 = True
except:
HAS_MOBILENETV3 = False





#----------------------------------------------------------
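
The `try`/`except` in this hunk records in `HAS_MOBILENETV3` whether the installed TensorFlow exposes MobileNetV3, presumably so that other code can check the flag instead of failing at import time. A minimal sketch of the same feature-detection pattern follows, catching only `ImportError` and `AttributeError` rather than using a bare `except`. `optional_attr` is a hypothetical helper, and the standard-library `math` module stands in for `keras.applications` so that the example runs without TensorFlow.

```python
import importlib


def optional_attr(module_name, attr_name):
    """Return (value, True) if module_name.attr_name exists, else (None, False)."""
    try:
        module = importlib.import_module(module_name)
        return getattr(module, attr_name), True
    except (ImportError, AttributeError):
        # Missing module or missing attribute: report the feature as unavailable.
        return None, False


# Stand-in targets from the standard library (assumption: TensorFlow may be absent).
sqrt, HAS_SQRT = optional_attr("math", "sqrt")
missing, HAS_MISSING = optional_attr("math", "no_such_function")
print(HAS_SQRT, HAS_MISSING)  # True False
```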
@@ -277,7 +294,6 @@ <h1 class="title">Module <code>ktrain.imports</code></h1>


# Suppress Warnings
SUPPRESS_DEP_WARNINGS = strtobool(os.environ.get(&#39;SUPPRESS_DEP_WARNINGS&#39;, &#39;1&#39;))
def set_global_logging_level(level=logging.ERROR, prefices=[&#34;&#34;]):
&#34;&#34;&#34;
Override logging levels of different modules based on their name as a prefix.
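
This hunk drops the `SUPPRESS_DEP_WARNINGS` lookup from the "Suppress Warnings" section because the same line now sits in the TensorFlow section near the top of the module, where it also sets `TF_CPP_MIN_LOG_LEVEL`. That variable only has an effect if it is set before TensorFlow is loaded. The sketch below shows the flag handling in isolation. It is an illustration under stated assumptions: `env_flag` and `configure_tf_logging` are hypothetical names, and `env_flag` is written by hand on the assumption that `strtobool` is the `distutils.util` function, which is no longer available from Python 3.12 onward.

```python
import os

_TRUE = {"1", "true", "t", "yes", "y", "on"}
_FALSE = {"0", "false", "f", "no", "n", "off"}


def env_flag(name, default="1", environ=None):
    """Read a boolean flag from the environment (hypothetical strtobool stand-in)."""
    environ = os.environ if environ is None else environ
    value = str(environ.get(name, default)).strip().lower()
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    raise ValueError(f"invalid truth value for {name}: {value!r}")


def configure_tf_logging(environ=None):
    """Set TF_CPP_MIN_LOG_LEVEL=3 unless SUPPRESS_DEP_WARNINGS is switched off."""
    environ = os.environ if environ is None else environ
    if env_flag("SUPPRESS_DEP_WARNINGS", "1", environ):
        environ["TF_CPP_MIN_LOG_LEVEL"] = "3"
    return environ.get("TF_CPP_MIN_LOG_LEVEL")
```

In a real module, `configure_tf_logging()` would have to be called before `import tensorflow`; setting `SUPPRESS_DEP_WARNINGS=0` in the environment leaves TensorFlow's log level untouched.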
4 changes: 4 additions & 0 deletions docs/index.html
@@ -479,8 +479,12 @@ <h2 class="section-title" id="header-functions">Functions</h2>
preproc.datagen.preprocessing_function = pre_resnet50
elif preproc_name == &#39;mobilenet&#39;:
preproc.datagen.preprocessing_function = pre_mobilenet
elif preproc_name == &#39;mobilenetv3&#39;:
preproc.datagen.preprocessing_function = pre_mobilenetv3small
elif preproc_name == &#39;inception&#39;:
preproc.datagen.preprocessing_function = pre_inception
elif preproc_name == &#39;efficientnet&#39;:
preproc.datagen.preprocessing_function = pre_efficientnet
else:
raise Exception(&#39;Unknown preprocessing_function name: %s&#39; % (preproc_name))

4 changes: 4 additions & 0 deletions docs/text/preprocessor.html
@@ -191,6 +191,8 @@ <h1 class="title">Module <code>ktrain.text.preprocessor</code></h1>
indices = []
for i in mb:
for doc in pb:
# https://stackoverflow.com/questions/67360987/bert-model-bug-encountered-during-training/67375675#67375675
doc = str(doc) if isinstance(doc, (float, int)) else doc
ids, segments = tokenizer.encode(doc, max_len=max_length)
indices.append(ids)
if verbose: mb.write(&#39;done.&#39;)
@@ -1582,6 +1584,8 @@ <h2 class="section-title" id="header-functions">Functions</h2>
indices = []
for i in mb:
for doc in pb:
# https://stackoverflow.com/questions/67360987/bert-model-bug-encountered-during-training/67375675#67375675
doc = str(doc) if isinstance(doc, (float, int)) else doc
ids, segments = tokenizer.encode(doc, max_len=max_length)
indices.append(ids)
if verbose: mb.write(&#39;done.&#39;)
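
The two added lines convert numeric documents to strings before they reach `tokenizer.encode`. The effect of that guard can be seen in isolation in the sketch below. `coerce_doc` is a hypothetical wrapper around the same expression, and the sample values (including a NaN, as might come from a missing cell in a loaded table) are assumptions for illustration.

```python
def coerce_doc(doc):
    """Convert numeric documents to strings; leave everything else untouched."""
    return str(doc) if isinstance(doc, (float, int)) else doc


docs = ["great movie", 2022, float("nan"), "3.5 stars"]
print([coerce_doc(d) for d in docs])  # ['great movie', '2022', 'nan', '3.5 stars']
```

Note that the guard only covers `float` and `int`; other non-string values such as `None` pass through unchanged.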