
Axon ignoring the activation function? #526

Closed
a-alhusaini opened this issue Aug 26, 2023 · 5 comments
Comments

@a-alhusaini

I have the following model:

model =
  Axon.input("input_0", shape: {nil, 1, 28, 28})
  |> Axon.flatten()
  |> Axon.dense(128, activation: :relu)
  |> Axon.dense(10, activation: :sigmoid)

After training the model and using it, the outputs are bigger than 1.

Based on my understanding of the sigmoid function, this shouldn't be possible. What am I missing?
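
As a quick reference (a sketch added here, not part of the original exchange): sigmoid squashes any real input into the open interval (0, 1), so genuine outputs above 1 would indeed be surprising. The input values below are arbitrary.

# Sigmoid maps any real number into (0, 1); large inputs saturate
# near 1.0 (and print as 1.0 in f32) but never exceed it.
Nx.sigmoid(Nx.tensor([-100.0, -1.0, 0.0, 1.0, 100.0]))
# => approximately [0.0, 0.2689, 0.5, 0.7311, 1.0]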

@polvalente
Contributor

Could you share the full code?

@a-alhusaini
Author

train_data = Scidata.MNIST.download()
test_data = Scidata.MNIST.download_test()
{train_data, train_labels} = train_data
{train_binary, train_type, train_shape} = train_data
{train_label_binary, train_label_type, train_label_shape} = train_labels

train_y =
  Nx.from_binary(train_label_binary, train_label_type)
  |> Nx.reshape(train_label_shape)
  |> Nx.new_axis(-1)
  |> Nx.equal(Enum.to_list(0..9) |> Nx.tensor())
  |> Nx.to_batched(32)
  |> Enum.to_list()

train_x =
  Nx.from_binary(train_binary, train_type)
  |> Nx.reshape(train_shape)
  |> Nx.divide(255)
  |> Nx.to_batched(32)
  |> Enum.to_list()

data = Enum.zip(train_x, train_y)

training_count = floor(0.8 * Enum.count(data))
validation_count = floor(0.2 * training_count)

{training_data, test_data} = Enum.split(data, training_count)
{validation_data, training_data} = Enum.split(training_data, validation_count)

# Define the model
model =
  Axon.input("input_0", shape: {nil, 1, 28, 28})
  |> Axon.flatten()
  |> Axon.dense(128, activation: :relu)
  |> Axon.dense(10, activation: :sigmoid)

# Train
state =
  model
  |> Axon.Loop.trainer(:categorical_cross_entropy, Axon.Optimizers.adam(0.01))
  |> Axon.Loop.metric(:accuracy, "Accuracy")
  |> Axon.Loop.validate(model, validation_data)
  |> Axon.Loop.run(training_data, %{}, compiler: EXLA, epochs: 10)

# Evaluate on the held-out split
model
|> Axon.Loop.evaluator()
|> Axon.Loop.metric(:accuracy, "Accuracy")
|> Axon.Loop.run(test_data, state)

# Predict on a single example
first_number = Enum.at(train_x, 0)[0]
Axon.predict(model, state, first_number)

@polvalente
Contributor

What's the Axon version used?

@polvalente
Contributor

With Axon 0.6 and EXLA 0.6, I got this output:

#Nx.Tensor<
  f32[1][10]
  EXLA.Backend<host:0, 0.3007448411.1655570452.243887>
  [
    [2.3431260944184697e-23, 4.296128036651581e-11, 1.0659236316176752e-17, 0.9999992847442627, 6.712944780263512e-22, 1.0, 2.8145804840526482e-22, 2.4531429665408666e-10, 9.463095196338145e-9, 2.6594868813845096e-6]
  ]
>

Maybe the confusion stems from the scientific notation?

All but the 4th and 6th entries there are far smaller than 1. Note the e-23 suffix on the first entry: it means 2.34... * 10 ** -23. Likewise, the second is 4.29 * 10 ** -11, and so on.
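
To double-check numerically (a sketch assuming `output` is the prediction tensor printed above), the maximum value confirms nothing exceeds 1:

# Largest entry in the prediction; for sigmoid outputs it should be <= 1.0.
Nx.reduce_max(output) |> Nx.to_number()
# => 1.0

# Or assert it for every entry (1 means all entries pass the check).
output |> Nx.less_equal(1.0) |> Nx.all() |> Nx.to_number()
# => 1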

@a-alhusaini
Author

Ah, sorry, I didn't notice the notation. I use a screen reader and was only listening to the first few numbers of every tensor (it takes a long time for a screen reader to read all those numbers). It turns out I never reached the end of any tensor, which is where the scientific notation appears.
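
A possible workaround in that situation (a sketch, not from the thread; `output` again stands for the tensor returned by Axon.predict/3): print each probability as a fixed-decimal string, so the magnitude is audible without reading to the end of each number.

# Print each class probability as a plain decimal, avoiding scientific notation.
output
|> Nx.to_flat_list()
|> Enum.with_index()
|> Enum.each(fn {p, digit} ->
  IO.puts("#{digit}: #{:erlang.float_to_binary(p, decimals: 8)}")
end)
# 0: 0.00000000
# 1: 0.00000000
# ...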
