logistic #35

Closed
yyyooohao opened this issue Aug 25, 2021 · 29 comments
Labels
question Further information is requested

Comments

@yyyooohao

I am using logistic regression from the linear models. The documentation says logistic regression is supported, so why do I get "PMML model does not contain RegressionModel" when I predict?

@iamDecode
Owner

Thanks for your interest in sklearn-pmml-model! In order for me to help you find the problem, it would be great if you can stick to the issue template. Without an extract of the PMML model you are trying to convert, it is difficult for me to help you.

That being said, based on the error I think the library you used to export the PMML model has converted the original RegressionModel to an equivalent GeneralRegressionModel. If this is the case, you should be able to generate predictions using PMMLRidgeClassifier for classification, or PMMLRidge for regression.
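
The exception message names the PMML element the loader looks for, so it can help to check which model element the exported file actually contains before choosing a loader class. Below is a minimal sketch using only the Python standard library. The helper name `find_model_elements` and the inline sample document are illustrative assumptions; they are not part of sklearn-pmml-model, and the sample is not real exporter output.

```python
import xml.etree.ElementTree as ET

# Model element names taken from the messages in this thread.
CANDIDATES = ("RegressionModel", "GeneralRegressionModel")


def local_name(tag):
    """Strip an XML namespace such as '{http://www.dmg.org/PMML-4_4}' from a tag."""
    return tag.rsplit("}", 1)[-1]


def find_model_elements(pmml_text):
    """Return the candidate model element names present in a PMML document."""
    root = ET.fromstring(pmml_text)
    found = {local_name(el.tag) for el in root.iter()}
    return [name for name in CANDIDATES if name in found]


# Tiny hand-written stand-in for an exported file (hypothetical content).
SAMPLE = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <GeneralRegressionModel functionName="classification"/>
</PMML>"""

print(find_model_elements(SAMPLE))
```

For a file on disk, `ET.parse(path).getroot()` could replace the `ET.fromstring` call. If the result lists GeneralRegressionModel rather than RegressionModel, that would match the situation described above.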

@iamDecode added the question label (Further information is requested) on Aug 25, 2021
@yyyooohao
Author

I previously used SVM for prediction. Now I want to classify with logistic regression from the linear models and test the accuracy of the results. I mainly want to try logistic regression for classification.

@yyyooohao reopened this on Aug 25, 2021
@yyyooohao
Author

I can make predictions with the SVM model exported to PMML, but prediction with the logistic regression classifier reports an error.

@iamDecode
Owner

I suppose you are using PMMLLogisticRegression to make 'logical classification' predictions? In my previous comment, I recommended using PMMLRidgeClassifier instead. To do that, just replace "PMMLLogisticRegression" with "PMMLRidgeClassifier". I think that should work for you.

@yyyooohao
Author

Should I change my training to RidgeClassifier, or is there a problem with my data processing? SVM tests fine, but now I get: Exception: PMML model does not contain GeneralRegressionModel.

@yyyooohao
Author

Why is it easier for me to predict with SVM, but harder to predict with logistic regression? Is there any other model that can do better classification?

@yyyooohao
Author

image
Prediction works when PMMLPipeline contains only the classifier, but the accuracy of the result is not high. The logistic regression parameters need to be adjusted to reach a certain accuracy.

@iamDecode
Owner

I am not entirely sure what your problem is. It would be helpful if you could provide a copy of the PMML file that you are having problems with.

In your screenshot you show PMMLPipeline. Note that this class is not part of this library; it comes from sklearn2pmml. That library converts sklearn models into PMML, whereas sklearn-pmml-model creates a sklearn model from a PMML file.

For me, PMMLLogisticRegression works just fine. Check out this simple example on how to use it along with sklearn2pmml:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
import pandas as pd
import numpy as np
from sklearn_pmml_model.linear_model import PMMLLogisticRegression
from sklearn2pmml.pipeline import PMMLPipeline
from sklearn2pmml import sklearn2pmml

# Prepare data
iris = load_iris()
X = pd.DataFrame(iris.data)
X.columns = np.array(iris.feature_names)
y = pd.Series(np.array(iris.target_names)[iris.target])
y.name = "Class"

# train logistic regression
clf = LogisticRegression()
pipeline = PMMLPipeline([
    ("classifier", clf)
])
pipeline.fit(X, y)

# convert to PMML
sklearn2pmml(pipeline, "test.pmml", with_repr = True)

# Load from PMML and predict
clf = PMMLLogisticRegression(pmml="test.pmml")
print(clf.predict(X))   # predicted class labels
print(clf.score(X, y))  # mean accuracy on the training data

@yyyooohao
Author

image
Logistic regression works now, but it is not very accurate: only 40%. Are there other models that do classification?

@iamDecode
Owner

The parameters you show don't make a lot of sense to me. max_iter = 2 is way too low to yield any decent classification. I suggest you start with LogisticRegression(), so without any arguments. See if that works (it should), and then gradually add arguments to see if it improves performance. Often enough, the default parameters prove to be sufficient.

If you like to try another model, I suggest trying RandomForestClassifier.
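
To illustrate why a very small iteration budget can hurt, here is a minimal sketch of a one-feature logistic model fitted with plain batch gradient descent. This is not the solver scikit-learn uses, and the data, learning rate, and function names are made up for illustration; the point is only that an iterative fit stopped after 2 steps is still close to its starting point.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(w, b, xs, ys):
    """Mean negative log-likelihood of a one-feature logistic model."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(xs)


def fit(xs, ys, n_iter, lr=0.1):
    """Plain batch gradient descent, starting from w = b = 0."""
    w = b = 0.0
    n = len(xs)
    for _ in range(n_iter):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy, non-separable data (hypothetical values).
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0, -0.2, 0.3]
ys = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

loss_few = log_loss(*fit(xs, ys, n_iter=2), xs, ys)
loss_many = log_loss(*fit(xs, ys, n_iter=500), xs, ys)
print(f"loss after 2 iterations:   {loss_few:.4f}")
print(f"loss after 500 iterations: {loss_many:.4f}")
```

The loss after 500 iterations is lower than after 2, which is the same effect, in miniature, as raising max_iter from a very small value.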

@yyyooohao
Author

The test accuracy with default parameters is not high; it reaches only about half that of SVM, so the parameters need to be tuned. I do not need an overly complex model.

@yyyooohao
Author

I tried the random forest and got: ModuleNotFoundError: No module named 'sklearn_pmml_model.tree._tree'. I am using three classes.

@iamDecode
Owner

iamDecode commented Aug 30, 2021

Please make sure you installed the library using pip install sklearn-pmml-model. This error seems to indicate the Cython code is not compiled, which is only the case if you downloaded this library and are working in that directory directly.

If you, for some reason, cannot use pip, running the following command will compile the Cython code in place, and should fix the issue you have:

python setup.py build_ext --inplace

I don't recommend this, and it will require a C compiler, which is a bit of a pain to set up on Windows. More information about this process can be found at https://sklearn-pmml-model.readthedocs.io/en/latest/install.html#from-source.
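
One way to check whether Python is picking up an uncompiled source checkout instead of the pip-installed package is to ask where a module would be loaded from. The sketch below uses only the standard library; the helper names `locate` and `is_under_cwd` are illustrative assumptions, not part of sklearn-pmml-model.

```python
import importlib.util
import os


def locate(module_name):
    """Return where a module would be loaded from, or None if it cannot be resolved."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # find_spec imports the parent packages of a dotted name; that import may fail.
        return None
    return None if spec is None else spec.origin


def is_under_cwd(path):
    """True if path lies inside the current working directory."""
    cwd = os.path.realpath(os.getcwd())
    try:
        return os.path.commonpath([cwd, os.path.realpath(path)]) == cwd
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False


for name in ("sklearn_pmml_model", "sklearn_pmml_model.tree._tree"):
    origin = locate(name)
    if origin is None:
        print(f"{name}: could not be resolved")
    elif is_under_cwd(origin):
        print(f"{name}: {origin} (inside the current directory, possibly a source checkout)")
    else:
        print(f"{name}: {origin}")
```

If the package resolves to a path inside the current directory while the compiled `_tree` module cannot be resolved, that would be consistent with running from within a clone of the repository.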

@yyyooohao
Author

I installed the packages according to requirements.txt.

@yyyooohao
Author

If I use logistic regression for a three-class model, can it not make predictions?

@yyyooohao
Author

I can use pip. How can I simply use random forest? I don't want to install a C compiler.

@yyyooohao
Author

Why do I get the following error when I use logistic regression for binary classification? Over the past two days, three-class classification has also reported errors.
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 2 is different from 1024)
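
The thread does not diagnose this error. One possible reading, offered as an assumption: the message says the first dimension of the second operand (size 2) differs from the last dimension of the first operand (size 1024), which could happen if the data passed to predict has 1024 columns while the loaded model's coefficients were fitted on 2 features. A minimal pre-check sketch in plain Python, with a hypothetical helper name and made-up data:

```python
def check_feature_count(rows, n_expected):
    """Raise a readable error if any row does not have n_expected values."""
    for i, row in enumerate(rows):
        if len(row) != n_expected:
            raise ValueError(
                f"row {i} has {len(row)} features, but the model expects {n_expected}"
            )


# Hypothetical numbers mirroring the error message: 1024 columns vs. 2 expected.
X = [[0.0] * 1024 for _ in range(3)]
try:
    check_feature_count(X, n_expected=2)
except ValueError as err:
    print(err)
```

Under that assumption, the fix would be to predict on data with the same columns, in the same order, as the data the exported model was trained on.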

@iamDecode
Owner

If you use pip to install the library, no C compiler is necessary. More information on how to install using pip can be found in the documentation: https://sklearn-pmml-model.readthedocs.io/en/latest/install.html#pip.

pip is the standard package manager for Python, and is included with every Python install. The documentation includes a link to more general information about pip here: https://packaging.python.org/tutorials/installing-packages/#use-pip-for-installing.

@yyyooohao
Author

I installed the packages from requirements.txt with pip. Why do I get errors with those models?

@iamDecode
Owner

Why do I get errors with those models

You have to let me know which errors you are seeing, otherwise I cannot help you.


I suspect you installed the packages with pip but are still working within a clone of this repository. If you are working in a copy of this repository, please remove it, start fresh, do a pip install, and try out the example I provided here: #35 (comment). If this works, you can proceed to try different models and datasets.

@yyyooohao
Author

ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 2 is different from 1024)
image
The error occurs when I use logistic regression or ridge regression. Binary classification with logistic regression worked before; can three-class classification be used? I mainly use it to test binary and three-class classification. For three-class classification, do I need to make any modifications?

@yyyooohao
Author

OK, so I should use the installed package version rather than using the repository directly in my project.

@yyyooohao
Author

image
I used logistic regression to classify into three classes and got Exception: PMML model does not contain RegressionModel. I reinstalled the package, and binary classification can now be predicted. Ridge regression has the same problem.

@yyyooohao
Author

ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject,
Do I need to do any configuration when I use GBDT classification?

@iamDecode
Owner

I used logistic regression to classify into three classes and got Exception: PMML model does not contain RegressionModel. I reinstalled the package, and binary classification can now be predicted. Ridge regression has the same problem.

Ok I think I understand now. You seem to be using the multi_class='ovr' parameter on your LogisticRegression class (from #35 (comment)). This means one-versus-rest regression. This type is not explicitly supported by the library yet, but I am working on adding it right now.

To get it working in the meantime, you can use the default parameter multi_class='auto' or specifically select multi_class='multinomial' instead. This type of regression should work!
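
As a rough illustration of how the two settings differ when turning class scores into probabilities, here is a minimal sketch in plain Python. The scores are made up, and the same scores are fed to both functions purely for comparison; in practice one-versus-rest fits a separate binary model per class, so its scores would differ from those of a multinomial fit.

```python
import math


def softmax(scores):
    """Multinomial: one joint model, probabilities from a softmax over all class scores."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def ovr(scores):
    """One-versus-rest: an independent sigmoid per class, then normalized to sum to 1."""
    sig = [1.0 / (1.0 + math.exp(-s)) for s in scores]
    total = sum(sig)
    return [p / total for p in sig]


scores = [2.0, 0.5, -1.0]  # hypothetical decision scores for three classes
print("multinomial:", [round(p, 3) for p in softmax(scores)])
print("ovr:        ", [round(p, 3) for p in ovr(scores)])
```

Both results sum to 1 and rank the classes in the same order here, but the probability values differ, which is one reason a loader may need to handle the two representations separately.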

@yyyooohao
Author

image
I got an error with three-class logistic regression: Exception: PMML model does not contain RegressionModel.

@yyyooohao
Author

image
I see. Three-class classification is running now, haha.

@iamDecode
Owner

ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject,

This error typically means you have to re-install numpy (pip install numpy --upgrade)
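
This kind of error usually arises when a compiled extension was built against a different NumPy version than the one currently installed, so it can help to record the installed versions before and after upgrading. A minimal sketch using the standard library's importlib.metadata; the helper name `installed_version` is an illustrative assumption.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


for dist in ("numpy", "scikit-learn", "sklearn-pmml-model"):
    print(dist, installed_version(dist))
```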

@iamDecode
Owner

I see. Three-class classification is running now, haha.

Glad you got it working! I have just released a new version that should also work with multi_class='ovr'. If your initial problem is resolved, can I close this issue?
