
Why not merge conditions in only one vector? #3

Closed · haocheng6 opened this issue Jul 8, 2019 · 7 comments

Comments

haocheng6 commented Jul 8, 2019

This is an inspiring project. Thanks.

I am relatively new to neural networks. I do not see why you use separate dense layers for different conditions. It seems to me that merging the condition vectors into one vector and feeding that vector into a single dense layer could achieve the same result.

Take predicting air quality as an example. If we have two features that are not time series, say city and number of vehicles, the first feature would be converted to a one-hot encoding, and the second would be represented directly as a number. Can I append the number of vehicles to the city vector and feed the new vector to a dense layer? That seems natural to me.
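For instance, something like this rough Keras sketch (the number of cities and the layer size here are made up for illustration):

```python
import tensorflow as tf
from tensorflow.keras import layers

num_cities = 10  # hypothetical number of cities

# The two non-time-series conditions as separate inputs.
city = layers.Input(shape=(num_cities,), name="city_one_hot")
vehicles = layers.Input(shape=(1,), name="num_vehicles")

# Append the vehicle count to the one-hot city vector...
merged = layers.Concatenate()([city, vehicles])
# ...and feed the merged vector through a single dense layer.
cond = layers.Dense(16, activation="relu")(merged)

model = tf.keras.Model(inputs=[city, vehicles], outputs=cond)
```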

What are your concerns when you use different dense layers for different conditions?

philipperemy (Owner) commented Jul 10, 2019

@BiggerHao yeah that's a good point!

You are totally correct here.

philipperemy (Owner) commented

The only thing that would differ is the bias.
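In other words, a dense layer on the concatenation computes W[x; y] + b = W_x x + W_y y + b, while separate per-condition dense layers (summed, assuming that is how the outputs are combined) compute W_x x + b_x + W_y y + b_y, so only the bias terms differ. A quick numpy sketch with toy shapes:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 10))  # toy stand-in for the one-hot city condition
y = rng.normal(size=(4, 1))   # toy stand-in for the vehicle-count condition

# One dense layer on the concatenation: W [x; y] + b
W = rng.normal(size=(11, 8))
b = rng.normal(size=(8,))
merged_out = np.concatenate([x, y], axis=1) @ W + b

# Two separate dense layers, summed: (W_x x + b_x) + (W_y y + b_y)
Wx, Wy = W[:10], W[10:]
bx, by = b / 2, b / 2  # any split with bx + by == b matches exactly
separate_out = (x @ Wx + bx) + (y @ Wy + by)

assert np.allclose(merged_out, separate_out)
```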

haocheng6 (Author) commented

@philipperemy I get it. Thank you.

shivam13juna commented

Hold on @BiggerHao, the city is one-hot encoded, right? That's a completely different way of representing data, and appending the number of vehicles (which is again a different representation) would make it extra hard for the model to understand the data pattern.

Further, if we expect the model to learn these differences, then why bother merging non-sequential data with sequential data at all? An LSTM can, at the end of the day, learn to differentiate between sequential and non-sequential data.

Lastly, I'm not trying to prove anyone wrong; I'm very curious and this is just a point of view. Have a good day, thanks!

haocheng6 (Author) commented

@shivam13juna I think using both continuous and one-hot categorical features simultaneously in one neural layer is a common practice, and it is no harder than feeding those two types of features into two separate layers and then merging the layers.

However, I am still not an expert in deep learning, so my view may be biased.

shivam13juna commented

@BiggerHao you said it's a common practice; is there any popular TensorFlow or PyTorch blog where you've seen that? I'm sorry, I've never seen merging done commonly, but I've mostly worked in the NLP domain, so it may well be possible.

Please let me know if you have seen such cases from any popular (trustworthy) source. Thank you!

haocheng6 (Author) commented

@shivam13juna I think I was wrong. 😢 It seems that it would be better to embed categorical features into a continuous space. (See How to combine categorical and continuous input features for neural network training.)
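For example, something like this rough Keras sketch (the vocabulary size and embedding size here are arbitrary, and the names are my own):

```python
import tensorflow as tf
from tensorflow.keras import layers

num_cities = 10  # hypothetical vocabulary size
emb_dim = 4      # arbitrary embedding size

# Integer-encoded city id instead of a one-hot vector.
city_id = layers.Input(shape=(1,), dtype="int32", name="city_id")
vehicles = layers.Input(shape=(1,), name="num_vehicles")

# Embed the categorical feature into a small continuous space...
city_emb = layers.Flatten()(layers.Embedding(num_cities, emb_dim)(city_id))
# ...then mix it with the continuous feature in one dense layer.
merged = layers.Concatenate()([city_emb, vehicles])
out = layers.Dense(16, activation="relu")(merged)

model = tf.keras.Model(inputs=[city_id, vehicles], outputs=out)
```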

This tutorial from TensorFlow uses many types of features in one input layer, but its purpose seems to be demonstration only.

Please let me know if you have found any other good resources.
