How can I initialize an ActiveLearner with 3-D shape data? #72

Open
rz-zhang opened this issue Feb 28, 2020 · 5 comments

@rz-zhang

rz-zhang commented Feb 28, 2020

from modAL.models import ActiveLearner

learner = ActiveLearner(
    estimator=classifier,
    X_training=X_initial, y_training=y_initial,
    verbose=1
)

I am working on a Named-Entity Recognition task, so the shape of y_initial is (1000, 75, 17),
where 1000 is the number of sentences, 75 is the number of words per sentence, and 17 is the number of possible tags per word. I then get the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-63-72e48629dd01> in <module>()
      5     estimator=classifier,
      6     X_training=X_initial, y_training=y_initial,
----> 7     verbose=1
      8 )

3 frames
/usr/local/lib/python3.6/dist-packages/modAL/models/learners.py in __init__(self, estimator, query_strategy, X_training, y_training, bootstrap_init, **fit_kwargs)
     77                  ) -> None:
     78         super().__init__(estimator, query_strategy,
---> 79                          X_training, y_training, bootstrap_init, **fit_kwargs)
     80 
     81     def teach(self, X: modALinput, y: modALinput, bootstrap: bool = False, only_new: bool = False, **fit_kwargs) -> None:

/usr/local/lib/python3.6/dist-packages/modAL/models/base.py in __init__(self, estimator, query_strategy, X_training, y_training, bootstrap_init, force_all_finite, **fit_kwargs)
     61         self.y_training = y_training
     62         if X_training is not None:
---> 63             self._fit_to_known(bootstrap=bootstrap_init, **fit_kwargs)
     64 
     65         assert isinstance(force_all_finite, bool), 'force_all_finite must be a bool'

/usr/local/lib/python3.6/dist-packages/modAL/models/base.py in _fit_to_known(self, bootstrap, **fit_kwargs)
    104         """
    105         if not bootstrap:
--> 106             self.estimator.fit(self.X_training, self.y_training, **fit_kwargs)
    107         else:
    108             n_instances = self.X_training.shape[0]

/usr/local/lib/python3.6/dist-packages/keras/wrappers/scikit_learn.py in fit(self, x, y, sample_weight, **kwargs)
    204             y = np.searchsorted(self.classes_, y)
    205         else:
--> 206             raise ValueError('Invalid shape for y: ' + str(y.shape))
    207         self.n_classes_ = len(self.classes_)
    208         if sample_weight is not None:

ValueError: Invalid shape for y: (1000, 75, 17)
@cosmic-cortex
Member

Do you get the error during initialization? From the error details, it looks as if the error occurs when you first call learner.fit() or learner.teach(), because it refers to y having an invalid shape: it should be (1000, 50, 15) like y_initial, but it is actually (1000, 75, 17).

@rz-zhang
Author

> Do you get the error during initialization? From the error details, it looks as if the error occurs when you first call learner.fit() or learner.teach(), because it refers to y having an invalid shape: it should be (1000, 50, 15) like y_initial, but it is actually (1000, 75, 17).

Sorry for the confusion. My code is actually consistent: both shapes are (1000, 75, 17). I have edited the description in my first comment.
To clarify, my question is whether the ActiveLearner class supports only 2-D data as input.

@cosmic-cortex
Member

Thanks for the clarification! Which Keras version are you running? (You can find this with, for example, pip freeze | grep keras.) It seems to me that the problem is actually with Keras: you may be using an older version where only 2D data is supported.

modAL supports arbitrary dimensions, so it is either a bug or an issue with your Keras version.
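
The call path in the traceback above can be traced with a small stand-in. This is only a sketch: MiniLearner and ShapeRecordingEstimator are hypothetical names, and MiniLearner copies just the two steps visible in the traceback (store the training data, then call estimator.fit on it). It is not modAL's actual class. Under that assumption, the 3-D arrays reach the estimator unchanged, so any shape check that fails has to live in the estimator's own fit().

```python
def shape_of(nested):
    """Return the shape of a regular, non-empty nested list as a tuple."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        if not nested:
            break
        nested = nested[0]
    return tuple(shape)


class ShapeRecordingEstimator:
    """Hypothetical estimator that only records the shapes passed to fit()."""

    def __init__(self):
        self.seen_shapes = None

    def fit(self, X, y, **fit_kwargs):
        self.seen_shapes = (shape_of(X), shape_of(y))
        return self


class MiniLearner:
    """Simplified stand-in for the call path in the traceback (not modAL)."""

    def __init__(self, estimator, X_training=None, y_training=None, **fit_kwargs):
        self.estimator = estimator
        self.X_training = X_training
        self.y_training = y_training
        if X_training is not None:
            # Same step as in the traceback: the data is handed over as-is.
            self.estimator.fit(self.X_training, self.y_training, **fit_kwargs)


# Tiny 3-D toy data: 4 sentences, 3 words each, 2 features and 5 tags per word.
X_initial = [[[0.0] * 2 for _ in range(3)] for _ in range(4)]
y_initial = [[[0] * 5 for _ in range(3)] for _ in range(4)]

learner = MiniLearner(ShapeRecordingEstimator(), X_initial, y_initial)
print(learner.estimator.seen_shapes)
```

Swapping the recording estimator for one whose fit() rejects 3-D labels would raise at the same point as in the original traceback.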

@rz-zhang
Author

rz-zhang commented Feb 28, 2020

> Thanks for the clarification! Which Keras version are you running? (You can find this with, for example, pip freeze | grep keras.) It seems to me that the problem is actually with Keras: you may be using an older version where only 2D data is supported.
>
> modAL supports arbitrary dimensions, so it is either a bug or an issue with your Keras version.

The versions I'm using are:
tf.version = 1.15.0
keras.version = 2.2.4

Following the traceback, I checked the source code of
python3.6/dist-packages/keras/wrappers/scikit_learn.py
and found that it places a strict limit on the shape of y, as you can see below.
So which Keras version do you recommend?

class KerasClassifier(BaseWrapper):
    """Implementation of the scikit-learn classifier API for Keras.
    """
    def fit(self, x, y, sample_weight=None, **kwargs):
        """Constructs a new model with `build_fn` & fit the model to `(x, y)`.

        # Arguments
            x : array-like, shape `(n_samples, n_features)`
                Training samples where `n_samples` is the number of samples
                and `n_features` is the number of features.
            y : array-like, shape `(n_samples,)` or `(n_samples, n_outputs)`
                True labels for `x`.
            **kwargs: dictionary arguments
                Legal arguments are the arguments of `Sequential.fit`

        # Returns
            history : object
                details about the training history at each epoch.

        # Raises
            ValueError: In case of invalid shape for `y` argument.
        """
        y = np.array(y)
        if len(y.shape) == 2 and y.shape[1] > 1:
            self.classes_ = np.arange(y.shape[1])
        elif (len(y.shape) == 2 and y.shape[1] == 1) or len(y.shape) == 1:
            self.classes_ = np.unique(y)
            y = np.searchsorted(self.classes_, y)
        else:
            raise ValueError('Invalid shape for y: ' + str(y.shape))
        self.n_classes_ = len(self.classes_)
        if sample_weight is not None:
            kwargs['sample_weight'] = sample_weight
        return super(KerasClassifier, self).fit(x, y, **kwargs)
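
To see which label shapes get through the check quoted above, its branches can be mirrored on a plain shape tuple. This is a minimal sketch: classify_y_shape is a hypothetical helper written for illustration and is not part of Keras or modAL. It reproduces only the if/elif/else conditions from the quoted fit method and does not train anything.

```python
def classify_y_shape(shape):
    """Mirror the branches of the quoted KerasClassifier.fit on a shape tuple.

    Hypothetical helper for illustration only.
    """
    if len(shape) == 2 and shape[1] > 1:
        # 2-D with several columns: treated as one label column per class.
        return "one-hot / multi-column labels"
    if (len(shape) == 2 and shape[1] == 1) or len(shape) == 1:
        # 1-D, or 2-D with a single column: treated as class indices.
        return "class-index labels"
    # Anything else, including every 3-D shape, is rejected.
    raise ValueError("Invalid shape for y: " + str(shape))


for shape in [(1000,), (1000, 1), (1000, 17), (1000, 75, 17)]:
    try:
        print(shape, "->", classify_y_shape(shape))
    except ValueError as err:
        print(shape, "->", err)
```

Only 1-D and 2-D label arrays match a branch, so a per-word tag array of shape (1000, 75, 17) always falls through to the ValueError.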

@cosmic-cortex
Member

Sorry for the late answer, I was extremely busy. I checked an earlier version of Keras that does not contain this restriction, so I would suggest the most recent version that predates this change. It seems to me that 2.0.1 would work: https://github.com/keras-team/keras/blob/2.0.1/keras/wrappers/scikit_learn.py

However, this is quite an old version, so it might not be the best choice. In that case, I would recommend the following. First, try to reproduce the issue without modAL to make sure it is a Keras problem. If you can confirm this, the next step is to open an issue in the Keras repository detailing the problem.
