feature_spec dnn example #829

Closed
wants to merge 1 commit into from

dfalbel
Member

@dfalbel dfalbel commented Jul 1, 2019

@skeydan here's the current code for the feature_spec example

@skeydan

skeydan commented Jul 1, 2019

great, thanks! I nearly would have clicked "merge" but we should probably merge rstudio/tfdatasets#42 first ;-)
Do you think we could do that now? In the example I'm getting

Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) :
  'x' must be atomic
16. stop("'x' must be atomic")
15. sort.int(x, na.last = na.last, decreasing = decreasing, ...)
14. sort.default(unique(c(self$vocabulary_list_aux, unq)))
13. sort(unique(c(self$vocabulary_list_aux, unq))) at feature_spec.R#578
12. self$steps[[i]]$fit_batch(nxt) at feature_spec.R#198
11. spec$fit() at feature_spec.R#984
10. fit.FeatureSpec(.)
9. fit(.)
8. function_list[[k]](value)
7. withVisible(function_list[[k]](value))
6. freduce(value, `_function_list`)
5. `_fseq`(`_lhs`)
4. eval(quote(`_fseq`(`_lhs`)), env, env)
3. eval(quote(`_fseq`(`_lhs`)), env, env)
2. withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
1. training %>% select(-id) %>% feature_spec(target ~ .) %>% step_numeric_column(ends_with("bin")) %>%
    step_numeric_column(-ends_with("bin"), -ends_with("cat"),
        normalizer_fn = scaler_standard()) %>% step_categorical_column_with_vocabulary_list(ends_with("cat")) %>%
    step_embedding_column(ends_with("cat"), dimension = function(vocab_size) as.integer(sqrt(vocab_size) +  ...

... perhaps because I've not updated tfdatasets from the PR?
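
For reference, here is a minimal base R sketch of one way `sort()` can raise this message: the argument is a list rather than an atomic vector. The names `vocabulary_so_far` and `batch_values_as_list` are hypothetical stand-ins, not the actual objects inside tfdatasets, and this is only a guess at the kind of input involved, not a diagnosis of the package.

```r
# Hypothetical stand-ins for the two values combined in the failing call;
# these are NOT the real objects from feature_spec.R.
vocabulary_so_far <- NULL
batch_values_as_list <- list("a", "b", "a")

# c(NULL, <list>) stays a list, and unique() on a list returns a list.
combined <- unique(c(vocabulary_so_far, batch_values_as_list))
print(is.list(combined))

# sort() refuses non-atomic input; capture the message instead of stopping.
result <- tryCatch(
  sort(combined),
  error = function(e) conditionMessage(e)
)
print(result)

# Flattening to an atomic character vector first lets sort() succeed.
sorted_ok <- sort(unique(unlist(combined)))
print(sorted_ok)
```

If the values reaching the vocabulary step arrive as a list, something along these lines might explain the traceback, but that would need checking against the tfdatasets version in the PR.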

@skeydan

skeydan commented Jul 1, 2019

Actually one suggestion...

You know what, I had started thinking about this (which hadn't been there before)

porto <- porto %>%
  mutate_at(vars(ends_with("cat")), as.character)

and so at first I was thinking, OK, we want to tell the "text embeddings story", so yes, it's a bit artificial but why not ... but now I'm back to my very first reaction on seeing the example, namely that texts normally have more than one word anyway, and that's what people will want to see?

What do you think about using

step_indicator_column(ends_with("cat")) %>%

and framing the story as just some unknown categorical variable, not (necessarily) text?
And have something about text at a later time...
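
To make the suggestion concrete, here is a rough base R sketch of what an indicator (one-hot) encoding of a single categorical variable looks like. It does not use tfdatasets, and the values in `cat_values` are made up for illustration; it only shows the shape of the encoding that an indicator column stands for, as opposed to a learned embedding.

```r
# Made-up values for one categorical variable (illustration only).
cat_values <- c("a", "b", "a", "c")

# The vocabulary is the sorted set of distinct values.
vocab <- sort(unique(cat_values))

# One row per observation, one column per vocabulary entry;
# a cell is 1 when the observation equals that entry, else 0.
one_hot <- outer(cat_values, vocab, "==") * 1L
colnames(one_hot) <- vocab

print(one_hot)
```

Each row has exactly one 1, so nothing here depends on the variable being text, which fits the "some unknown categorical variable" framing.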

@dfalbel dfalbel closed this Oct 27, 2020