Merge branch 'develop'
amaiya committed Jul 7, 2020
2 parents c5b9311 + 563ce27 commit 301ec1b
Showing 15 changed files with 470 additions and 209 deletions.
21 changes: 21 additions & 0 deletions CHANGELOG.md
@@ -6,6 +6,27 @@ Most recent releases are shown at the top. Each release shows:
- **Changed**: Additional parameters, changes to inputs or outputs, etc
- **Fixed**: Bug fixes that don't change documented behaviour

## 0.18.0 (2020-07-07)

### New:
- N/A

### Changed
- Fixes to address changes or issues in TensorFlow 2.2.0:
  - created `metrics_from_model` function due to changes in the way metrics are extracted from compiled model
  - used `loss_fn_from_model` function due to changes in the way loss functions are extracted from compiled model
  - added `**kwargs` to `AdamWeightDecay` based on [this issue](https://github.com/tensorflow/addons/issues/1645)
  - changed `TransformerTextClassLearner.predict` and `TextPredictor.predict` to deal with tuples being returned by `predict` in TensorFlow 2.2.0
  - changed multilabel test to use loss instead of accuracy due to [TF 2.2.0 issue](https://github.com/tensorflow/tensorflow/issues/41114)
  - changed `Learner.lr_find` to use `save_model` and `load_model` to restore weights due to [this TF issue](https://github.com/tensorflow/tensorflow/issues/41116)
    and added `TransformersPreprocessor.load_model_and_configure_from_data` to support this

### Fixed:
- N/A




## 0.17.5 (2020-07-02)

### New:
8 changes: 8 additions & 0 deletions FAQ.md
@@ -18,6 +18,8 @@

- [How do I get the predicted class "probabilities" of a model?](#how-do-i-get-the-predicted-class-probabilities-of-a-model)

- [How do I handle imbalanced datasets?](#how-do-i-handle-imbalanced-datasets)

- [How do I retrieve or visualize training history?](#how-do-i-retrieve-or-visualize-training-history)

- [I have a model that accepts multiple inputs (e.g., both text and other numerical or categorical variables). How do I train it with *ktrain*?](#i-have-a-model-that-accepts-multiple-inputs-eg-both-text-and-other-numerical-or-categorical-variables--how-do-i-train-it-with-ktrain)
@@ -352,6 +354,12 @@ learner.autofit(0.005, 2, callbacks=[RocAuc])
All `predict` methods in `Predictor` instances accept a `return_proba` argument. Set it to `True` to obtain the class probabilities.


### How do I handle imbalanced datasets?

All `*fit*` methods (e.g., `learner.fit`, `learner.autofit`, `learner.fit_onecycle`) accept a `class_weight` parameter, which is passed
through to the `model.fit` method in `tf.keras`. See [this StackOverflow post](https://stackoverflow.com/questions/44716150/how-can-i-assign-a-class-weight-in-keras-in-a-simple-way) for more details.
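
As a minimal sketch of building such a dictionary, the snippet below computes "balanced" weights, where each class gets `n_samples / (n_classes * class_count)`. The helper name `balanced_class_weights` and the toy labels are illustrative assumptions and not part of *ktrain*; the `autofit` call is shown as a comment because it requires an existing `learner`.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Map each integer class index to n_samples / (n_classes * class_count).

    Hypothetical helper for illustration only; not a ktrain function.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Toy labels: class 0 is three times as frequent as class 1.
y_train = [0, 0, 0, 1]
class_weight = balanced_class_weights(y_train)
print(class_weight)

# With a learner already created via ktrain.get_learner(...):
# learner.autofit(0.005, 2, class_weight=class_weight)
```

The rarer class receives the larger weight, so its examples contribute more to the loss during training.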


[[Back to Top](#frequently-asked-questions-about-ktrain)]


11 changes: 7 additions & 4 deletions README.md
@@ -7,6 +7,8 @@


### News and Announcements
- **2020-07-07:**
- ***ktrain*** **v0.18.x is released** and now supports TensorFlow 2.2.0.
- **2020-06-26:**
- ***ktrain*** **v0.17.x is released** and includes support for **language translation**. See the [example language translation notebook](https://nbviewer.jupyter.org/github/amaiya/ktrain/blob/develop/examples/text/language_translation_example.ipynb) for more information. <sub><sup>(This feature currently requires that PyTorch be installed.)</sup></sub>
```python
@@ -282,15 +284,16 @@ Using *ktrain* on **Google Colab**? See these Colab examples:

### Installation

*ktrain* currently uses [TensorFlow 2.1.0](https://www.tensorflow.org/install/pip?lang=python3), which will be installed automatically when installing *ktrain*.
While *ktrain* will probably work with other versions of TensorFlow 2.x, v2.1.0 is the current recommended and tested version.
*ktrain* currently uses [TensorFlow 2.2.0](https://www.tensorflow.org/install/pip?lang=python3), which will be installed automatically when installing *ktrain*.
While *ktrain* will probably work with other versions of TensorFlow 2.x, v2.1.0 and v2.2.0 are the current recommended and tested versions.

1. Make sure pip is up-to-date with: `pip3 install -U pip`

2. Install *ktrain*: `pip3 install ktrain`

**Some things to note:**
- *ktrain* will automatically install TensorFlow 2.1.0 as a dependency. Since TensorFlow 2.1.0 does not support Python 3.8, running `pip install ktrain` on a system with Python 3.8 as default (e.g., Ubuntu 20.04) will currently not work, as TensorFlow 2.1.0 will not be able to be downloaded. *ktrain* will support TensorFlow 2.2.0 (and Python 3.8) in the near future. In the meantime, if you are on a system with Python 3.8 as default (like Ubuntu 20.04 LTS), you will need to install Python 3.7 and use that version of Python with *ktrain*.
- *ktrain* will automatically install TensorFlow 2.2.0 as a dependency if TensorFlow is not already installed on your system.
If TensorFlow 2.1.0 is already installed, *ktrain* will not install TensorFlow 2.2.0 and will use TensorFlow 2.1.0 instead.
- Since some *ktrain* dependencies have not yet been migrated to `tf.keras` in TensorFlow 2 (or may have other issues),
*ktrain* is temporarily using forked versions of some libraries. Specifically, *ktrain* uses forked versions of the `eli5` and `stellargraph` libraries. If not installed, *ktrain* will complain when a method or function needing
either of these libraries is invoked.
@@ -300,7 +303,7 @@ pip3 install git+https://github.com/amaiya/eli5@tfkeras_0_10_1
pip3 install git+https://github.com/amaiya/stellargraph@no_tf_dep_082
```

This code was tested on Ubuntu 18.04 LTS using TensorFlow 2.1.0
This code was tested on Ubuntu 18.04 LTS using TensorFlow 2.2.0 and Python 3.6.9.


### How to Cite
5 changes: 5 additions & 0 deletions ktrain/__init__.py
@@ -116,3 +116,8 @@ def get_learner(model, train_data=None, val_data=None,
return learner(model, train_data=train_data, val_data=val_data,
batch_size=batch_size, eval_batch_size=eval_batch_size,
workers=workers, use_multiprocessing=use_multiprocessing, multigpu=multigpu)


# keys
# currently_unsupported: unsupported or disabled features (e.g., xai graph neural networks have not been implemented)
# dep_fix: a fix to address a problem in a dependency
45 changes: 31 additions & 14 deletions ktrain/core.py
@@ -256,9 +256,16 @@ def save_model(self, fpath):
return


def load_model(self, fpath, custom_objects=None):
def load_model(self, fpath, custom_objects=None, **kwargs):
"""
a wrapper to load_model
loads model from file path to folder.
Note: **kwargs included for backwards compatibility only, as TransformerTextClassLearner.load_model was removed in v0.18.0.
Args:
fpath(str): path to folder containing model
custom_objects(dict): custom objects required to load model.
For models included with ktrain, this is populated automatically
and can be disregarded.
"""
self.model = _load_model(fpath, train_data=self.train_data, custom_objects=custom_objects)
return
@@ -288,13 +295,15 @@ def _recompile(self, wd=None):
#self.model.compile(optimizer=self.model.optimizer,
#loss=self.model.loss,
#metrics=metrics)
metrics = [m.name for m in self.model.metrics] if U.is_tf_keras() else self.model.metrics
if wd is not None and type(self.model.optimizer).__name__ != 'AdamWeightDecay':
metrics = U.metrics_from_model(self.model)
if wd is not None and wd > 0 and type(self.model.optimizer).__name__ != 'AdamWeightDecay':
warnings.warn('recompiling model to use AdamWeightDecay as optimizer with weight decay of %s' % (wd) )
optimizer = U.get_default_optimizer(wd=wd)
elif wd is not None:
elif wd is not None and wd > 0:
optimizer = U.get_default_optimizer(wd=wd)
else:
elif wd is not None and wd == 0:
optimizer = U.DEFAULT_OPT
else: # wd is None -> don't modify optimizer
optimizer = self.model.optimizer
self.model.compile(optimizer=optimizer,
loss=self.model.loss,
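
The branch order in the `_recompile` hunk above can be summarized with a small stand-alone sketch. It assumes nothing about *ktrain* internals beyond what the hunk shows: the function name `select_optimizer` is hypothetical, and the returned strings are placeholder labels standing in for `U.get_default_optimizer(wd=wd)`, `U.DEFAULT_OPT`, and the model's existing optimizer.

```python
import warnings


def select_optimizer(wd, current_name):
    """Return a label for the optimizer the hunk above would choose.

    Hypothetical illustration: strings stand in for real optimizer objects.
    """
    if wd is not None and wd > 0:
        if current_name != 'AdamWeightDecay':
            # Only the switch away from another optimizer triggers a warning.
            warnings.warn('recompiling with weight decay of %s' % wd)
        return 'get_default_optimizer(wd=%s)' % wd
    elif wd is not None and wd == 0:
        # Explicit zero weight decay falls back to the default optimizer.
        return 'DEFAULT_OPT'
    else:
        # wd is None (or negative): leave the current optimizer untouched.
        return current_name


print(select_optimizer(None, 'sgd'))
print(select_optimizer(0, 'sgd'))
print(select_optimizer(0.01, 'AdamWeightDecay'))
```

The practical effect of the change is that `wd=None` no longer alters the optimizer, whereas `wd=0` explicitly resets it to the default.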
@@ -444,9 +453,11 @@ def lr_find(self, start_lr=1e-7, lr_mult=1.01, max_epochs=None,
verbose=verbose)

# save current weights and temporarily restore original weights
new_file, weightfile = tempfile.mkstemp()
self.model.save_weights(weightfile)
#self.model.load_weights(self._original_weights)
# dep_fix: temporarily use save_model instead of save_weights due to https://github.com/tensorflow/tensorflow/issues/41116
#new_file, weightfile = tempfile.mkstemp()
#self.model.save_weights(weightfile)
temp_folder = tempfile.mkdtemp()
self.save_model(temp_folder)


# compute steps_per_epoch
@@ -480,11 +491,14 @@ def lr_find(self, start_lr=1e-7, lr_mult=1.01, max_epochs=None,
verbose=verbose)
except KeyboardInterrupt:
# re-load current weights
self.model.load_weights(weightfile)
#self.model.load_weights(weightfile)
self.load_model(temp_folder)
return

# re-load current weights
self.model.load_weights(weightfile)
# 2020-0707: temporarily use load_model instead of load_weights due to https://github.com/tensorflow/tensorflow/issues/41116
#self.model.load_weights(weightfile)
self.load_model(temp_folder)

# instructions to invoker
U.vprint('\n', verbose=verbose)
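
The save-then-restore pattern used by `lr_find` in the hunks above (save the model to a temporary folder, run a destructive experiment, then reload) can be sketched in isolation. This is an illustration under stated assumptions, not the *ktrain* implementation: `ToyModel` and `run_with_restore` are hypothetical names, JSON stands in for the real model format, and the `try`/`finally` with folder cleanup is a variation that the commit itself does not include.

```python
import json
import os
import shutil
import tempfile


class ToyModel:
    """Hypothetical stand-in for a model that can be saved to a folder."""

    def __init__(self, weights):
        self.weights = list(weights)

    def save(self, folder):
        with open(os.path.join(folder, 'weights.json'), 'w') as f:
            json.dump(self.weights, f)

    def load(self, folder):
        with open(os.path.join(folder, 'weights.json')) as f:
            self.weights = json.load(f)


def run_with_restore(model, experiment):
    """Run experiment(model), then restore the state saved beforehand."""
    temp_folder = tempfile.mkdtemp()
    model.save(temp_folder)
    try:
        experiment(model)
    finally:
        # Restore even if the experiment is interrupted, then clean up.
        model.load(temp_folder)
        shutil.rmtree(temp_folder, ignore_errors=True)


def perturb(m):
    # Simulates a learning-rate sweep that wrecks the weights.
    m.weights[0] = 99.0


model = ToyModel([1.0, 2.0])
run_with_restore(model, perturb)
print(model.weights)
```

Unlike this sketch, the code in the commit handles `KeyboardInterrupt` explicitly and returns after reloading, rather than letting the exception propagate.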
@@ -1426,9 +1440,12 @@ def release_gpu_memory(device=0):
def _load_model(fpath, preproc=None, train_data=None, custom_objects=None):
if not preproc and not train_data:
raise ValueError('Either preproc or train_data is required.')
if preproc and isinstance(preproc, TransformersPreprocessor):
# note: with transformer models, fname is actually a directory
model = preproc.get_model(fpath=fpath)
if (preproc and isinstance(preproc, TransformersPreprocessor)) or \
(train_data and U.is_huggingface(data=train_data)):
if preproc:
model = preproc.get_model(fpath=fpath)
else:
model = TransformersPreprocessor.load_model_and_configure_from_data(fpath, train_data)
return model
elif (preproc and (isinstance(preproc, BERTPreprocessor) or \
type(preproc).__name__ == 'BERTPreprocessor')) or\
