Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updated model script formats #210

Merged
merged 20 commits into from
Apr 30, 2020

Conversation

AdarshKumar712
Copy link
Contributor

@AdarshKumar712 AdarshKumar712 commented Mar 4, 2020

I have updated the script formats of the following models:

  1. vision/mnist/mlp.jl
  2. other/iris/iris.jl

I have also added Project.toml and Manifest.toml files for iris dataset. Will add Project.toml and Manifest.toml files for the mnist models as well, once there script format are updated. I am currently working on them. I will add them as well pretty soon. Please review these updates, and suggest further changes.

Update:
I have updated following models as well:

  1. other/housing/housing.jl
  2. vision/mnist/conv.jl
  3. vision/mnist/vae.jl
  4. vision/mnist/autoencoders.jl
  5. vision/mnist/cifar10/cifar10.jl
  6. vision/mnist/cppn/cppn.jl
  7. vision/dcgan_mnist/dcgan.jl
  8. other/fizzbuzz/fizzbuzz.jl
  9. other/bitstring-parity/xor1.jl, xor2.jl, xor3.jl
  10. text/char-rnn/char-rnn.jl
  11. text/lang-detection/model.jl
  12. text/phonemes/0-data.jl, 1-model.jl
  13. text/treebank/data.jl, recursive.jl

@DhairyaLGandhi
Copy link
Member

We should avoid adding environments per model, it's one thing that made the zoo very difficult to maintain

@AdarshKumar712
Copy link
Contributor Author

Ok, then I will remove local environment files from here, and from some other places as well, and instead will Update the global environment file.

@CarloLucibello
Copy link
Member

CarloLucibello commented Mar 5, 2020

We should avoid adding environments per model, it's one thing that made the zoo very difficult to maintain

Do you have a reference for this discussion? I would say quite the opposite: a general manifest for all of the examples in unmaintainable. No one will ever update all the scripts at the same time, so some of them will rot and some will go out of sync with the general manifest. Kind of the current situation

@AdarshKumar712
Copy link
Contributor Author

AdarshKumar712 commented Mar 5, 2020

I have a doubt. In the Conv.jl, we are saving the parameters as:
BSON.@save joinpath(dirname(@__DIR__), "mnist_conv.bson") params=cpu.(params(model)) epoch_idx acc
How one is going reload the model using this way? Shouldn't we add how models can be reloaded as well?
@dhairyagandhi96 @CarloLucibello

other/iris/iris.jl Outdated Show resolved Hide resolved
@CarloLucibello
Copy link
Member

I have a doubt. In the Conv.jl, we are saving the parameters as:
BSON.@save joinpath(dirname(@__DIR__), "mnist_conv.bson") params=cpu.(params(model)) epoch_idx acc
How one is going reload the model using this way? Shouldn't we add how models can be reloaded as well?
@dhairyagandhi96 @CarloLucibello

you can use load_params!. It won't work correctly for models with buffers such as batch norm layers, since won't be saved and reloaded, but for a simple conv net it's fine

@AdarshKumar712
Copy link
Contributor Author

AdarshKumar712 commented Mar 6, 2020

@dhairyagandhi96 @CarloLucibello I have updated few other models. Please have a look, and suggest changes wherever required.

@DhairyaLGandhi
Copy link
Member

ref #173

@DhairyaLGandhi
Copy link
Member

The general manifest is expected to work with patch updates, else its a bug, and a major update would largely require a rewrite anyway

@CarloLucibello
Copy link
Member

The general manifest is expected to work with patch updates, else its a bug, and a major update would largely require a rewrite anyway

problem is that after a major update no one is going to undertake the update of the whole repo (we have just seen this). What we can hope for (and what we consistently got in the past months) are pointwise PRs updating single scripts. This is why we should just have one manifest per example to have both manageability and consistency

@DhairyaLGandhi
Copy link
Member

We had that earlier and it lead to a case where we missed the guarantee that the models would update accordingly. Maintaining multiple environments was also error prone. This time the reason was that we waited for some Zygote bugs to be resolved, a large change which took time to stabilize

@AdarshKumar712
Copy link
Contributor Author

AdarshKumar712 commented Mar 13, 2020

Sorry I had been busy writing my GSoC Proposal, because of which I wasn't able to work on this PR. Its almost done now, I will soon update format of other files as well.
Btw, @dhairyagandhi96 how about having individual Manifest and Project files for each section? Like Vision, Text, Others. Manifest.toml, Project.toml files for each one. This way it would be easier to maintain each section separately.

@AdarshKumar712
Copy link
Contributor Author

@CarloLucibello @dhairyagandhi96 I have updated all models in text, other and vision section, except for one flux-next as I was not clear about its purpose. Please do have a look at the changes and suggest improvements.

@jamblejoe
Copy link

Is the device keyword of the Args struct in mlp.jl ever used? Is the model actually trained on the gpu?

@AdarshKumar712
Copy link
Contributor Author

Sorry @jamblejoe, I haven't checked for the GPU support for all the scripts. This device keyword was left unused by mistake. I will update this shortly. Thanks for reminding! :)

@DhairyaLGandhi
Copy link
Member

What remains to be done here?

@AdarshKumar712
Copy link
Contributor Author

AdarshKumar712 commented Apr 24, 2020

@dhairyagandhi96 I have updated all the models except others/Flux-next and contrib section. All models work with CPU. But the text models throws an error with the GPU because of Onehotmatrix. Other than that vision models work fine with GPU.

Also for char-rnn.jl, the model performs better without reset command there. So I haven't included that in char-rnn.jl .

vision/mnist/mlp.jl Outdated Show resolved Hide resolved

function loss(xs, ys)
l = sum(crossentropy.(m.(gpu.(xs)), gpu.(ys)))
l = sum(logitcrossentropy.(m.(xs), ys))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it ok to have a broadcast here?

Copy link
Contributor Author

@AdarshKumar712 AdarshKumar712 Apr 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, here without the broadcast, it throws the following error:

MethodError: no method matching isless(::Array{Float32,2}, ::Array{Float32,2})
Closest candidates are:
  isless(!Matched::Missing, ::Any) at missing.jl:87
  isless(::Any, !Matched::Missing) at missing.jl:88

as here xs is an array of 50 elements where each element is a 2D array (N x 50) which is input for the model. However, I think here it would be better to have mean rather than sum. I was not much sure on that. So I left it as it was earlier.

@@ -79,7 +79,7 @@ function train(; kws...)
@info("Constructing Model...")
state, encode = Construct_model(args)

loss(x, yo, y) = sum(crossentropy.(model(x, yo, state, encode), y))
loss(x, yo, y) = sum(logitcrossentropy.(model(x, yo, state, encode), y))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also here, do we want to broadcast?

Copy link
Contributor Author

@AdarshKumar712 AdarshKumar712 Apr 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also had the same problem as above.

@CarloLucibello
Copy link
Member

ok, let's rebase and merge this

vision/mnist/mlp.jl Outdated Show resolved Hide resolved
@CarloLucibello CarloLucibello merged commit 4bf8476 into FluxML:master Apr 30, 2020
@CarloLucibello
Copy link
Member

thanks for all this work!

@MikeInnes
Copy link
Member

MikeInnes commented May 12, 2020

See #226 (comment), but also this PR did not update the project or manifest files, so the code does not currently run.

@AdarshKumar712
Copy link
Contributor Author

AdarshKumar712 commented May 13, 2020

The project file was not updated earlier because it was unclear how were we going to handle project dependencies. Though Dhairya earlier pointed out(comment) that having many project files makes it harder to handle, however at the same time having a single project file makes it tough at the time of making updates(comment). To this I suggested that it would be better if we have a section-wise Project and Manifest file like for Vision, Text, Other. However, this discussion didn't go to a conclusion at that time, and remain undecided.
I apologize that I missed pointing it out earlier. I hope it won't take much time to update the project files once it's decided on how we are going to keep the project files.

@MikeInnes
Copy link
Member

Having code in the zoo that needs a project file but doesn't have one definitely seems like the worst of all worlds.

@AdarshKumar712
Copy link
Contributor Author

I have added the Project files, #232

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants