
Foundations --> Embeddings #203

Closed
gitgithan opened this issue Oct 19, 2021 · 3 comments
gitgithan commented Oct 19, 2021

  1. Typo under the Model section: in "3. We'll apply convolution via filters (filter_size, vocab_size, num_filters)", shouldn't embedding_dim replace vocab_size?
  2. Typo under Experiments: "first have to decice" should be "decide".
  3. Typo under Interpretability: in "padding our inputs before convolution to result is outputs", "is" should be "in".
  4. Could there be a general explanation of moving models and data across devices? My current understanding is that both have to be on the same device (CPU or GPU), and that if they're on the GPU they can just stay there through the whole train/eval/predict session. I couldn't understand why, under Inference, device = torch.device("cpu") moves things back to the CPU.
  5. interpretable_trainer.predict_step(dataloader) breaks with AttributeError: 'list' object has no attribute 'dim'. The failing step is F.softmax(z): for interpretable_model, z is a list of 3 items, so softmax is applied to a list instead of a tensor. A minimal repro follows this list.
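A minimal repro of 5 (the tensors here are hypothetical stand-ins for the per-filter-size conv outputs):

import torch
import torch.nn.functional as F

# Stand-in for interpretable_model's output: one conv-output tensor per filter size
z = [torch.randn(1, 50, 8) for _ in range(3)]

try:
    F.softmax(z)  # F.softmax calls input.dim() to infer a default dim, so a plain list fails
except AttributeError as e:
    print(e)  # 'list' object has no attribute 'dim'

probs = F.softmax(torch.stack(z), dim=-1)  # stacking into one tensor avoids the error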
@GokuMohandas (Owner)

1-3: Thanks for these.
4. You usually want to do inference on a CPU/TPU (optimized for the forward pass) so you don't tie up a machine with a GPU, which is usually reserved for training with backprop (a quick device sketch follows the updated code below).
5. Good catch. I've updated the code on the webpage to look like this now:

import numpy as np
import torch

# Get conv outputs
interpretable_model.eval()
conv_outputs = []
with torch.inference_mode():
    for i, batch in enumerate(dataloader):

        # Forward pass w/ inputs
        inputs, targets = batch[:-1], batch[-1]
        z = interpretable_model(inputs)

        # Store conv outputs
        conv_outputs.extend(z)

conv_outputs = np.vstack(conv_outputs)
print(conv_outputs.shape)  # (len(filter_sizes), num_filters, max_seq_len)
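For context on why this fixes the error: conv_outputs now collects the per-filter-size tensors and np.vstack stacks them into a single array, so downstream ops never see a raw Python list.

As for 4, here's a minimal sketch of the device-placement pattern (not the lesson's exact code; nn.Linear is just a hypothetical stand-in): the model and its inputs only need to live on the same device, and .to(device) is what moves them.

import torch
import torch.nn as nn

# Train on the GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(10, 2).to(device)  # moves the model's parameters to `device`
x = torch.randn(4, 10).to(device)    # inputs must live on the same device as the model
y = model(x)                         # works: parameters and inputs are co-located

# For inference, move the model back to the CPU so the GPU stays free for training
model = model.to(torch.device("cpu"))
x_cpu = torch.randn(4, 10)           # new tensors default to the CPU
y_cpu = model(x_cpu)                 # forward pass runs fine on the CPU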

JellenRoberts commented Oct 20, 2021 via email

@GokuMohandas (Owner)

Hey @JellenRoberts, I'm sorry, but I'm not entirely sure either; I agree that we don't have to open issues this frequently. @gitgithan, I sent you a LinkedIn request, so let's chat there about all future minor issues and clarifications, because more than a thousand people are watching this repo and they're actively getting emails for every single conversation. We can share anything with large implications for the whole community here as we come across it.
