This repository has been archived by the owner on Aug 31, 2021. It is now read-only.

Hands-on assistance with embedding and logistic regression over aggregated data #136

Closed
borisRa opened this issue Mar 9, 2016 · 3 comments

borisRa commented Mar 9, 2016

Hi,

I need assistance with three issues:

  1. How do I apply the embedding only to the categorical features (I also have continuous ones)?
  2. How do I address the following issue with Skflow: [http://stackoverflow.com/questions/33871615/train-a-model-with-probability-response-or-number-of-successes-failures-rather]
  3. How do I get the estimated probability of a success from the logistic output?

Thanks,
Boris

@ilblackdragon
Contributor

  1. Currently this is not very convenient; I'm working on an API to make it better.
    For now, you need to pass everything in as one continuous matrix and then split it inside the model function,
    e.g.
import tensorflow as tf

def my_model(X, y):
    # X is [batch_size, n_features]; the first n_cat columns hold categorical ids
    # and the remaining n_cont columns are continuous (n_cat is defined outside).
    # A size of -1 tells tf.slice to take everything left in that dimension.
    Xcat = tf.cast(tf.slice(X, [0, 0], [-1, n_cat]), tf.int64)
    Xcont = tf.slice(X, [0, n_cat], [-1, -1])

This way Xcat can be passed into categorical_variable and then combined with the continuous features.

Stay tuned for a better way to do it!
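
To illustrate the split, embed, and re-join idea independently of any particular Skflow version, here is a minimal NumPy sketch. It is not the Skflow API: the sizes, the random embedding tables, and the helper name embed_and_combine are all assumptions made for the example, and in a real model the tables would be trainable variables.

```python
import numpy as np

# Illustrative sizes (assumptions, not taken from the thread).
n_cat, n_cont, embedding_dim, n_levels = 2, 3, 4, 10

rng = np.random.default_rng(0)
# One embedding table per categorical column, each [n_levels, embedding_dim].
# Random here; a real model would learn these values.
tables = [rng.normal(size=(n_levels, embedding_dim)) for _ in range(n_cat)]


def embed_and_combine(X):
    """Split X into categorical and continuous parts, embed, and re-join."""
    X_cat = X[:, :n_cat].astype(np.int64)  # category ids, [batch, n_cat]
    X_cont = X[:, n_cat:]                  # continuous part, [batch, n_cont]
    # Look up one embedding vector per row for each categorical column.
    embedded = [tables[j][X_cat[:, j]] for j in range(n_cat)]
    # "Column bind": concatenate along the feature axis (axis=1).
    return np.concatenate(embedded + [X_cont], axis=1)


X = np.array([[1.0, 7.0, 0.5, -1.2, 3.3],
              [4.0, 0.0, 2.0, 0.1, -0.7]])
combined = embed_and_combine(X)
print(combined.shape)  # n_cat * embedding_dim + n_cont columns per row
```

In TensorFlow the last step corresponds to tf.concat along the feature dimension. Its argument order has changed between releases, so check the documentation for the version you are running.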

  2. @terrytangyuan answered this on Stack Overflow.
  3. Do you mean how to get probabilities out of the estimator for the logistic output? You can just call estimator.predict_proba, which returns the probability of each class instead of the predicted class.

Let me know if this answers your questions!
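
To make the difference between class predictions and per-class probabilities concrete, here is a small pure-Python sketch. The functions predict_proba and predict below are hypothetical stand-ins that mimic the shape of an estimator's output; they are not Skflow internals, and the logits are made-up values.

```python
import math


def softmax(logits):
    """Turn one row of raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(batch_logits):
    # One probability per class for every input row.
    return [softmax(row) for row in batch_logits]


def predict(batch_logits):
    # The hard prediction is the index of the most probable class.
    return [max(range(len(p)), key=p.__getitem__)
            for p in predict_proba(batch_logits)]


logits = [[2.0, 0.0], [-1.0, 1.0]]  # hypothetical model outputs for two rows
proba = predict_proba(logits)
classes = predict(logits)
print(proba)
print(classes)
```

For a binary problem with classes ordered 0 and 1, the estimated probability of a success would be the second entry of each row.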

Author

borisRa commented Mar 9, 2016

Thanks for the quick response!

  1. About the first one: how do I combine (column-bind, for tf objects) Xcat and Xcont back into X,
    so that I can apply the deep learning models to X?
  2. I meant how to feed aggregated data into the logistic regression: instead of '1' for success and '0' for failure, the Y attribute holds the total number of successes and failures per aggregation level. For now there is no such support in scikit-learn (see "Logistic regression with a probability response or with number of successes/failures rather than binary outputs", scikit-learn/scikit-learn#6496 (comment)).

Is there a solution for this problem in Skflow?

Thanks again !
Boris
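
For reference, the aggregated-data setup can be written down without any framework as a binomial log-likelihood, where each aggregation level contributes successes * log(p) + failures * log(1 - p). The sketch below fits a one-feature model this way with plain gradient ascent. It is an illustration only, not something Skflow provides: the data, the function name fit_aggregated, and the learning-rate settings are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_aggregated(xs, successes, failures, lr=0.5, n_steps=5000):
    """Fit p = sigmoid(w * x + b) by gradient ascent on the binomial log-likelihood."""
    w, b = 0.0, 0.0
    total = sum(successes) + sum(failures)
    for _ in range(n_steps):
        grad_w = grad_b = 0.0
        for x, s, f in zip(xs, successes, failures):
            p = sigmoid(w * x + b)
            residual = s - (s + f) * p  # observed minus expected successes
            grad_w += residual * x
            grad_b += residual
        w += lr * grad_w / total
        b += lr * grad_b / total
    return w, b


# Hypothetical aggregated data: one row per level of x.
xs = [0.0, 1.0, 2.0, 3.0]
successes = [2, 4, 6, 8]
failures = [8, 6, 4, 2]
w, b = fit_aggregated(xs, successes, failures)

# Expanding every group into 0/1 rows gives the same fit, because a binary row
# is just a group whose (successes, failures) is (1, 0) or (0, 1).
rows_x, rows_s, rows_f = [], [], []
for x, s, f in zip(xs, successes, failures):
    rows_x += [x] * (s + f)
    rows_s += [1] * s + [0] * f
    rows_f += [0] * s + [1] * f
w_rows, b_rows = fit_aggregated(rows_x, rows_s, rows_f)
print(w, b)
print(w_rows, b_rows)
```

The same objective can be carried into a custom model function as a cross-entropy loss in which each aggregation level is weighted by its counts.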

@ilblackdragon
Contributor

FeatureColumns are the way to do this now. Please use a recent version of TensorFlow.
Thanks!
