Merge pull request #124 from gustavdelius/typos
Updating section that looks at Flux.jl code, which has changed
ChrisRackauckas committed May 3, 2023
2 parents 1115c17 + 7aed018 commit 9228207
Showing 3 changed files with 31 additions and 17 deletions.
40 changes: 27 additions & 13 deletions _weave/lecture03/sciml.jmd
@@ -13,8 +13,8 @@ weave_options:

Here we will start to dig into what scientific machine learning is all about
by looking at physics-informed neural networks. Let's start by understanding
-what a neural network really is, why they are used, and what kinds of problems
-that they solve, and then we will use this understanding of a neural network
+what neural networks really are, why they are used, and what kinds of problems
+they solve, and then we will use this understanding of a neural network
to see how to solve ordinary differential equations with neural networks.
From there, we will use this method to regularize neural networks with physical
equations, the aforementioned physics-informed neural network, and see how to
@@ -121,8 +121,8 @@ NN3(rand(10))
```

The second activation function there is what's known as a `relu`. A `relu` can
-be good to use because it's an exceptionally operation and satisfies a form of
-the UAT. However, a downside is that its derivative is not continuous, which
+be good to use because it's an exceptionally fast operation and satisfies a form of
+the universal approximation theorem (UAT). However, a downside is that its derivative is not continuous, which
could impact the numerical properties of some algorithms, and thus it's widely
used throughout standard machine learning but we'll see reasons why it may be
disadvantageous in some cases in scientific machine learning.
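
As an aside (not part of the diff): a minimal sketch of the `relu` point above. The `relu` and `drelu` below are local stand-ins defined only for illustration, not the Flux versions.

```julia
# relu(x) = max(0, x): a single comparison, so it is exceptionally cheap, yet
# networks built from it still satisfy a form of the universal approximation theorem.
relu(x) = max(zero(x), x)

# Its slope jumps from 0 to 1 at x = 0, so the derivative is not continuous there.
drelu(x) = x > zero(x) ? one(x) : zero(x)

relu.([-2.0, -0.5, 0.0, 0.5, 2.0])   # [0.0, 0.0, 0.0, 0.5, 2.0]
drelu.([-1e-8, 1e-8])                # [0.0, 1.0] -- a jump across zero
```
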
@@ -144,7 +144,7 @@ using InteractiveUtils
@which Dense(10 => 32,tanh)
```

-If we go to that spot of the documentation, we find the following.
+If we go to that spot of the code, we find the following:

```julia;eval=false
struct Dense{F, M<:AbstractMatrix, B}
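  # (Editorial note, not part of the Flux source: the rest of this definition is
  #  truncated by the hunk boundary. As the discussion below explains, it holds a
  #  weight matrix of type M, a bias of type B, and an activation function of
  #  type F, plus an inner constructor; the exact field names are not shown here.)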
@@ -164,14 +164,27 @@ end
```

First, `Dense` defines a struct in Julia. This struct just holds a weight matrix
-`W`, a bias vector `b`, and an activation function `σ`. The function called
-`Dense` is what's known as an **outer constructor** which defines how the
-`Dense` type is built. If you give it two integers (and optionally an activation
+`W`, a bias vector `b`, and an activation function `σ`. It also defines an **inner constructor**
+that ensures that a created `Dense` object will have the desired properties and types
+for its fields.
+The function called `Dense` that is defined next, outside the `struct`, is what's known
+as an **outer constructor** which provides a more convenient way to create a `Dense`
+object. If you give it a `Pair` of integers (and optionally an activation
function which defaults to `identity`), then what it will do is take random
initial `W` and `b` matrices (according to the `glorot_uniform` distribution for
`W` and `zeros` for `b`), and then it will build the type with those matrices.

-The last portion might be new. This is known as a **callable struct**, or a
+The next portion might be new. We give it here in the simpler form it had in earlier
+versions of the Flux package, so that we can concentrate on the essentials:
+
+```julia;eval=false
+function (a::Dense)(x::AbstractArray)
+  W, b, σ = a.W, a.b, a.σ
+  σ.(W*x .+ b)
+end
+```
+
+This defines what is known as a **callable struct**, or a
functor. It defines the dispatch for how calls work on the struct. As a quick
demonstration, let's define a type `MyCallableStruct` with a field `x`, and then make instances
of `MyCallableStruct` be the function `x+y`:
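
As an aside (not part of the diff): the demonstration block that this colon introduces lies outside the displayed hunk. A minimal sketch of such a callable struct, reusing the names from the prose, could look like the following; the array method at the end also illustrates the point made in the next hunk about dispatching on the input type.

```julia
# A struct with a single field `x` ...
struct MyCallableStruct
    x
end

# ... with a convenience outer constructor that defaults the field to 0.0 ...
MyCallableStruct() = MyCallableStruct(0.0)

# ... made callable: an instance applied to `y` computes x + y.
(a::MyCallableStruct)(y::Number) = a.x + y

# Dispatch can also depend on the input type, e.g. broadcasting over arrays.
(a::MyCallableStruct)(y::AbstractArray) = a.x .+ y

a = MyCallableStruct(2.0)
a(3.0)          # 5.0
a([1.0, 2.0])   # [3.0, 4.0]
```
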
@@ -191,7 +204,7 @@ an object in a way that references the `self`, though it's a bit more general
due to allowing dispatching, i.e. this can then depend on the input types
as well.

-So let's look at that `Dense` call with this in mind:
+So let's look at `Dense` with this in mind:

```julia;eval=false
function (a::Dense)(x::AbstractArray)
@@ -216,7 +229,8 @@ inside of them. Now what does `Chain` do?
@which Chain(1,2,3)
```

-gives us:
+Again, for our explanations here we will look at the slightly simpler code from
+an earlier version of the Flux package:

```julia;eval=false
struct Chain{T<:Tuple}
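# --- Editorial aside, not part of the Flux source above (which is truncated by
#     the hunk boundary): a minimal Chain-like functor, sketched here only to
#     illustrate the idea. It stores its layers in a tuple and, when called,
#     threads the input through them in order.
#
#     struct MiniChain{T<:Tuple}
#         layers::T
#     end
#     MiniChain(layers...) = MiniChain(layers)
#     (c::MiniChain)(x) = foldl((input, layer) -> layer(input), c.layers; init = x)
#
#     MiniChain(x -> 2x, x -> x .+ 1)([1.0, 2.0])   # [3.0, 5.0]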
@@ -365,7 +379,7 @@ loss() = sum(abs2,sum(abs2,NN(rand(10)).-1) for i in 1:100)
loss()
```

-This loss function takes 100 random points in `[0,1]` and then computes the output
+This loss function takes 100 random points in ``[0,1]^{10}`` and then computes the output
of the neural network minus `1` on each of the values, and sums up the squared
values (`abs2`). Why the squared values? This means that every computed loss value
is positive, and so we know that by decreasing the loss this means that, on average
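
As an aside (not part of the diff; the rest of this paragraph lies outside the displayed hunk): a runnable sketch of the loss construction described above, with a hypothetical stand-in `NNstub` in place of the Flux network `NN`.

```julia
W = randn(5, 10)                     # hypothetical fixed weights: x ↦ tanh.(W*x)
NNstub(x) = tanh.(W * x)             # stand-in for the neural network NN

# 100 random points in [0,1]^10; for each, sum the squared deviations of the
# outputs from 1, then accumulate (mirroring the nested abs2 in loss() above).
loss() = sum(abs2, sum(abs2, NNstub(rand(10)) .- 1) for i in 1:100)
loss()
```
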
@@ -660,7 +674,7 @@ one-dimensional spring pushing and pulling against a wall.

But instead of the simple spring, let's assume we had a more complex spring,
for example, let's say ``F(x) = -kx + 0.1sin(x)`` where this extra term is due to
-some deformities in the medal (assume mass=1). Then by Newton's law of motion
+some deformities in the metal (assume mass=1). Then by Newton's law of motion
we have a second order ordinary differential equation:

```math
2 changes: 1 addition & 1 deletion course/index.md
@@ -7,7 +7,7 @@ weave = false

## Syllabus

-**Pre-recorded online lectures are available to compliment the lecture notes**
+**Pre-recorded online lectures are available to complement the lecture notes**

**Prerequisites**: While this course will be mixing ideas from high performance
computing, numerical analysis, and machine learning, no one in the course is
6 changes: 3 additions & 3 deletions index.md
@@ -54,15 +54,15 @@ modeling.

However, these methods will quickly run into a scaling issue if naively coded.
To handle this problem, everything will have a focus on performance-engineering.
-We will start by focusing on algorithm which are inherently serial and
+We will start by focusing on algorithms that are inherently serial and
learn to optimize serial code. Then we will showcase how logic-heavy
code can be parallelized through multithreading and distributed computing
techniques like MPI, while direct mathematical descriptions can be parallelized
through GPU computing.

The final part of the course will be a unique project which pulls together these
techniques. As a new field, the students will be exposed to the "low hanging
fruit" and will be directed towards an area which they can make a quick impact.
fruit" and will be directed towards an area in which they can make a quick impact.
For the final project, students will team up to solve a new problem in the field of
-scientific machine learning, and receive helping writing up a publication-quality
+scientific machine learning, and receive help in writing up a publication-quality
analysis about their work.
