
Merge pull request #26 from JuliaPOMDP/pomdps-08-compat
POMDPs v0.8 Compatibility
zsunberg committed Sep 20, 2019
2 parents 34154e2 + 44cf4cd commit 1e4096e
Showing 29 changed files with 320 additions and 235 deletions.
1 change: 1 addition & 0 deletions .travis.yml
@@ -5,6 +5,7 @@ os:

julia:
- 1.0
- 1

notifications:
email: false
6 changes: 4 additions & 2 deletions Project.toml
@@ -1,7 +1,7 @@
name = "POMDPModelTools"
uuid = "08074719-1b2a-587c-a292-00f91cc44415"
authors = ["JuliaPOMDP Contributors"]
version = "0.1.7"
version = "0.2.0"

[deps]
Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
@@ -14,14 +14,16 @@ UnicodePlots = "b8865327-cd53-5732-bb35-84acbb429228"

[compat]
Distributions = ">= 0.17"
POMDPs = "0.7.3, 0.8.0"
julia = "1"

[extras]
BeliefUpdaters = "8bb6e9a1-7d73-552c-a44a-e5dc5634aac4"
POMDPModels = "355abbd5-f08e-5560-ac9e-8b5f2592a0ca"
POMDPPolicies = "182e52fb-cfd0-5e46-8c26-fd0667c990f4"
POMDPSimulators = "e0d0a172-29c6-5d4e-96d0-f262df5d01fd"
Pkg = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

[targets]
test = ["Test", "POMDPModels", "POMDPSimulators", "POMDPPolicies", "Pkg"]
test = ["Test", "POMDPModels", "POMDPSimulators", "POMDPPolicies", "BeliefUpdaters", "Pkg"]
10 changes: 3 additions & 7 deletions docs/make.jl
@@ -1,17 +1,13 @@
push!(LOAD_PATH, "../src/")

using Documenter, POMDPModelTools

makedocs(
modules = [POMDPModelTools],
format = :html,
format = Documenter.HTML(),
sitename = "POMDPModelTools.jl"
)

deploydocs(
repo = "github.com/JuliaPOMDP/POMDPModelTools.jl.git",
julia = "1.0",
osname = "linux",
target = "build",
deps = nothing,
make = nothing
)

23 changes: 0 additions & 23 deletions docs/mkdocs.yml

This file was deleted.

2 changes: 1 addition & 1 deletion docs/src/index.md
@@ -1,6 +1,6 @@
# About

POMDPModelTools is a collection of interface extensions and tools to make writing models and solvers for [POMDPs.jl](github.com/JuliaPOMDP/POMDPs.jl) easier.
POMDPModelTools is a collection of interface extensions and tools to make writing models and solvers for [POMDPs.jl](https://github.com/JuliaPOMDP/POMDPs.jl) easier.

```@contents
```
10 changes: 8 additions & 2 deletions docs/src/interface_extensions.md
@@ -35,9 +35,15 @@ ordered_observations

It is often the case that useful information besides the belief, state, action, etc. is generated by a function in POMDPs.jl. This information can be useful for debugging or for understanding the behavior of a solver, updater, or problem. The info interface provides a standard way for problems, policies, solvers, or updaters to output this information. The recording simulators from [POMDPSimulators.jl](https://github.com/JuliaPOMDP/POMDPSimulators.jl) automatically record this information.

To specify info for a problem (in POMDPs v0.8 and above), one should modify the problem's DDN with the `add_infonode` function, then return the info in `gen`. There is an example of this pattern in the docstring below:

```@docs
add_infonode
```

To specify info from policies, solvers, or updaters, implement the following functions:

```@docs
generate_sri
generate_sori
action_info
solve_info
update_info
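As a hedged illustration of the info-interface pattern described above, a policy can return diagnostic information alongside its action. The policy type and fields below are hypothetical toy stand-ins, not the real POMDPPolicies API; only the `(action, info)` return convention mirrors the interface:

```julia
# Toy sketch of the action_info pattern: GreedyPolicy and its fields are
# hypothetical; info is typically a NamedTuple, Dict, or nothing.
struct GreedyPolicy
    values::Vector{Float64}   # value estimate for each action index
end

# Return the chosen action together with a NamedTuple of diagnostic info.
function action_info(p::GreedyPolicy, s)
    a = argmax(p.values)
    return a, (value = p.values[a], n_considered = length(p.values))
end

a, info = action_info(GreedyPolicy([0.1, 0.9, 0.3]), 1)
# a == 2; info.value == 0.9
```

A recording simulator can then store `info` at each step without the policy needing any special hooks.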
13 changes: 7 additions & 6 deletions src/POMDPModelTools.jl
@@ -6,9 +6,9 @@ using LinearAlgebra
using SparseArrays
using UnicodePlots

import POMDPs: actions, n_actions, actionindex
import POMDPs: states, n_states, stateindex
import POMDPs: observations, n_observations, obsindex
import POMDPs: actions, actionindex
import POMDPs: states, stateindex
import POMDPs: observations, obsindex
import POMDPs: sampletype, generate_sr, initialstate, isterminal, discount
import POMDPs: implemented
import Distributions: pdf, mode, mean, support
@@ -22,11 +22,12 @@ include("visualization.jl")

# info interface
export
generate_sri,
generate_sori,
add_infonode,
action_info,
solve_info,
update_info
update_info,
generate_sri,
generate_sori
include("info.jl")

export
5 changes: 0 additions & 5 deletions src/distributions/bool.jl
@@ -23,11 +23,6 @@ function Base.iterate(d::BoolDistribution, state::Bool)
end

support(d::BoolDistribution) = [true, false]

==(d1::BoolDistribution, d2::BoolDistribution) = d1.p == d2.p

Base.hash(d::BoolDistribution) = hash(d.p)

Base.length(d::BoolDistribution) = 2

Base.show(io::IO, m::MIME"text/plain", d::BoolDistribution) = showdistribution(io, m, d, title="BoolDistribution")
1 change: 1 addition & 0 deletions src/distributions/deterministic.jl
@@ -13,6 +13,7 @@ rand(rng::AbstractRNG, d::Deterministic) = d.val
rand(d::Deterministic) = d.val
support(d::Deterministic) = (d.val,)
sampletype(::Type{Deterministic{T}}) where T = T
Random.gentype(::Type{Deterministic{T}}) where T = T
pdf(d::Deterministic, x) = convert(Float64, x == d.val)
mode(d::Deterministic) = d.val
mean(d::Deterministic{N}) where N<:Number = d.val / 1 # / 1 is to make this return a similar type to Statistics.mean
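For reference, the `Deterministic` distribution being extended above behaves roughly like the following minimal sketch. This is a simplified stand-in (here named `Det` to make clear it is not the package's implementation): a distribution that always produces a single value.

```julia
# Simplified stand-in for a deterministic distribution.
struct Det{T}
    val::T
end

Base.rand(d::Det) = d.val                    # sampling always yields val
support(d::Det) = (d.val,)                   # the only possible outcome
pdf(d::Det, x) = x == d.val ? 1.0 : 0.0      # all probability mass on val
mode(d::Det) = d.val
```

This shape explains why `gentype` for `Deterministic{T}` is simply `T`: every sample is the stored value itself.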
1 change: 1 addition & 0 deletions src/distributions/sparse_cat.jl
@@ -90,6 +90,7 @@ end
Base.length(d::SparseCat) = min(length(d.vals), length(d.probs))
Base.eltype(D::Type{SparseCat{V,P}}) where {V, P} = Pair{eltype(V), eltype(P)}
sampletype(D::Type{SparseCat{V,P}}) where {V, P} = eltype(V)
Random.gentype(D::Type{SparseCat{V,P}}) where {V, P} = eltype(V)

function mean(d::SparseCat)
vsum = zero(eltype(d.vals))
2 changes: 2 additions & 0 deletions src/distributions/uniform.jl
@@ -24,6 +24,7 @@ end

support(d::Uniform) = d.set
sampletype(::Type{Uniform{T}}) where T = eltype(T)
Random.gentype(::Type{Uniform{T}}) where T = eltype(T)

function pdf(d::Uniform, s)
if s in d.set
@@ -49,6 +50,7 @@ end
pdf(d::UnsafeUniform, s) = 1.0/length(d.collection)
support(d::UnsafeUniform) = d.collection
sampletype(::Type{UnsafeUniform{T}}) where T = eltype(T)
Random.gentype(::Type{UnsafeUniform{T}}) where T = eltype(T)

# Common Implementations

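The `Random.gentype` definitions added in the hunks above follow Julia's standard method-extension pattern: `gentype` reports the element type that sampling will produce. A self-contained sketch with a hypothetical distribution type (`MyUniform` is illustrative, not part of the package):

```julia
using Random

# Hypothetical uniform-over-a-collection distribution. Random.gentype serves
# the role sampletype played before, reporting the sample element type.
struct MyUniform{T}
    set::T
end

Random.gentype(::Type{MyUniform{T}}) where {T} = eltype(T)
Base.rand(rng::AbstractRNG, d::MyUniform) = rand(rng, d.set)
```

Defining both `sampletype` and `Random.gentype`, as the diff does, keeps the package working across POMDPs v0.7 and v0.8.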
54 changes: 37 additions & 17 deletions src/fully_observable_pomdp.jl
@@ -3,24 +3,33 @@
Turn `MDP` `mdp` into a `POMDP` where the observations are the states of the MDP.
"""
struct FullyObservablePOMDP{S, A} <: POMDP{S,A,S}
mdp::MDP{S, A}
struct FullyObservablePOMDP{M,S,A} <: POMDP{S,A,S}
mdp::M
end

function FullyObservablePOMDP(m::MDP)
return FullyObservablePOMDP{typeof(m), statetype(m), actiontype(m)}(m)
end

mdptype(::Type{FullyObservablePOMDP{M,S,A}}) where {M,S,A} = M

function POMDPs.DDNStructure(::Type{M}) where M <: FullyObservablePOMDP
MM = mdptype(M)
add_obsnode(DDNStructure(MM))
end

add_obsnode(ddn) = add_node(ddn, :o, FunctionDDNNode((m,sp)->sp), (:sp,)) # for ::DDNStructure, but that annotation cannot be used because DDNStructure is not yet declared in POMDPs v0.7.3

POMDPs.observations(pomdp::FullyObservablePOMDP) = states(pomdp.mdp)
POMDPs.n_observations(pomdp::FullyObservablePOMDP) = n_states(pomdp.mdp)
POMDPs.obsindex(pomdp::FullyObservablePOMDP{S, A}, o::S) where {S, A} = stateindex(pomdp.mdp, o)

POMDPs.convert_o(T::Type{V}, o, pomdp::FullyObservablePOMDP) where {V<:AbstractArray} = convert_s(T, o, pomdp.mdp)
POMDPs.convert_o(T::Type{S}, vec::V, pomdp::FullyObservablePOMDP) where {S,V<:AbstractArray} = convert_s(T, vec, pomdp.mdp)

POMDPs.gen(::DDNNode{:o}, m::FullyObservablePOMDP, sp, rng) = sp

function POMDPs.generate_o(pomdp::FullyObservablePOMDP, s, a, rng::AbstractRNG)
return s
end

function POMDPs.observation(pomdp::FullyObservablePOMDP, s, a)
return Deterministic(s)
function POMDPs.observation(pomdp::FullyObservablePOMDP, a, sp)
return Deterministic(sp)
end

function POMDPs.observation(pomdp::FullyObservablePOMDP, s, a, sp)
@@ -31,19 +40,30 @@ end

POMDPs.states(pomdp::FullyObservablePOMDP) = states(pomdp.mdp)
POMDPs.actions(pomdp::FullyObservablePOMDP) = actions(pomdp.mdp)
POMDPs.transition(pomdp::FullyObservablePOMDP{S,A}, s::S, a::A) where {S,A} = transition(pomdp.mdp, s, a)
POMDPs.transition(pomdp::FullyObservablePOMDP, s, a) = transition(pomdp.mdp, s, a)
POMDPs.initialstate_distribution(pomdp::FullyObservablePOMDP) = initialstate_distribution(pomdp.mdp)
POMDPs.initialstate(pomdp::FullyObservablePOMDP, rng::AbstractRNG) = initialstate(pomdp.mdp, rng)
POMDPs.generate_s(pomdp::FullyObservablePOMDP, s, a, rng::AbstractRNG) = generate_s(pomdp.mdp, s, a, rng)
POMDPs.generate_sr(pomdp::FullyObservablePOMDP, s, a, rng::AbstractRNG) = generate_sr(pomdp.mdp, s, a, rng)
POMDPs.reward(pomdp::FullyObservablePOMDP{S, A}, s::S, a::A) where {S,A} = reward(pomdp.mdp, s, a)
POMDPs.isterminal(pomdp::FullyObservablePOMDP, s) = isterminal(pomdp.mdp, s)
POMDPs.discount(pomdp::FullyObservablePOMDP) = discount(pomdp.mdp)
POMDPs.n_states(pomdp::FullyObservablePOMDP) = n_states(pomdp.mdp)
POMDPs.n_actions(pomdp::FullyObservablePOMDP) = n_actions(pomdp.mdp)
POMDPs.stateindex(pomdp::FullyObservablePOMDP{S,A}, s::S) where {S,A} = stateindex(pomdp.mdp, s)
POMDPs.actionindex(pomdp::FullyObservablePOMDP{S, A}, a::A) where {S,A} = actionindex(pomdp.mdp, a)
POMDPs.stateindex(pomdp::FullyObservablePOMDP, s) = stateindex(pomdp.mdp, s)
POMDPs.actionindex(pomdp::FullyObservablePOMDP, a) = actionindex(pomdp.mdp, a)
POMDPs.convert_s(T::Type{V}, s, pomdp::FullyObservablePOMDP) where V<:AbstractArray = convert_s(T, s, pomdp.mdp)
POMDPs.convert_s(T::Type{S}, vec::V, pomdp::FullyObservablePOMDP) where {S,V<:AbstractArray} = convert_s(T, vec, pomdp.mdp)
POMDPs.convert_a(T::Type{V}, a, pomdp::FullyObservablePOMDP) where V<:AbstractArray = convert_a(T, a, pomdp.mdp)
POMDPs.convert_a(T::Type{A}, vec::V, pomdp::FullyObservablePOMDP) where {A,V<:AbstractArray} = convert_a(T, vec, pomdp.mdp)

POMDPs.gen(d::DDNNode, m::FullyObservablePOMDP, args...) = gen(d, m.mdp, args...)
POMDPs.gen(m::FullyObservablePOMDP, s, a, rng) = gen(m.mdp, s, a, rng)
POMDPs.reward(pomdp::FullyObservablePOMDP, s, a) = reward(pomdp.mdp, s, a)

# deprecated in POMDPs v0.8
add_obsnode(ddn::POMDPs.DDNStructureV7{(:s,:a,:sp,:r)}) = POMDPs.DDNStructureV7{(:s,:a,:sp,:o,:r)}()
add_obsnode(ddn::POMDPs.DDNStructureV7) = error("FullyObservablePOMDP only supports MDPs with the standard DDN Structure (DDNStructureV7{(:s,:a,:sp,:r)}) with POMDPs v0.7.")

POMDPs.generate_s(pomdp::FullyObservablePOMDP, s, a, rng::AbstractRNG) = generate_s(pomdp.mdp, s, a, rng)
POMDPs.generate_sr(pomdp::FullyObservablePOMDP, s, a, rng::AbstractRNG) = generate_sr(pomdp.mdp, s, a, rng)
POMDPs.n_actions(pomdp::FullyObservablePOMDP) = n_actions(pomdp.mdp)
POMDPs.n_states(pomdp::FullyObservablePOMDP) = n_states(pomdp.mdp)
function POMDPs.generate_o(pomdp::FullyObservablePOMDP, s, rng::AbstractRNG)
return s
end
11 changes: 8 additions & 3 deletions src/generative_belief_mdp.jl
@@ -14,15 +14,20 @@ function GenerativeBeliefMDP(pomdp::P, up::U) where {P<:POMDP, U<:Updater}
GenerativeBeliefMDP{P, U, typeof(b0), actiontype(pomdp)}(pomdp, up)
end

function generate_sr(bmdp::GenerativeBeliefMDP, b, a, rng::AbstractRNG)
function POMDPs.gen(bmdp::GenerativeBeliefMDP, b, a, rng::AbstractRNG)
s = rand(rng, b)
if isterminal(bmdp.pomdp, s)
bp = gbmdp_handle_terminal(bmdp.pomdp, bmdp.updater, b, s, a, rng::AbstractRNG)::typeof(b)
return bp, 0.0
end
sp, o, r = generate_sor(bmdp.pomdp, s, a, rng) # maybe this should have been generate_or?
sp, o, r = gen(DDNOut(:sp,:o,:r), bmdp.pomdp, s, a, rng) # maybe this should have been generate_or?
bp = update(bmdp.updater, b, a, o)
return bp, r
return (sp=bp, r=r)
end

function generate_sr(bmdp::GenerativeBeliefMDP, b, a, rng::AbstractRNG)
x = gen(bmdp, b, a, rng)
return x.sp, x.r
end

function initialstate(bmdp::GenerativeBeliefMDP, rng::AbstractRNG)
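The `gen` method above implements the standard belief-MDP step: sample a state from the current belief, simulate the underlying POMDP, then update the belief with the resulting observation. A self-contained sketch of that control flow, using hypothetical toy stand-ins (`simulate_step`, `update_belief`) rather than the real POMDPs.jl API:

```julia
using Random

# Toy stand-ins: simulate_step(s, a, rng) -> (sp, o, r),
# update_belief(b, a, o) -> b'. Both are illustrative only.
simulate_step(s, a, rng) = (s + a, s + a + rand(rng, -1:1), abs(s))
update_belief(b, a, o) = [o - 1, o, o + 1]   # toy belief: states near o

# One step of the induced belief MDP, mirroring the gen method in the diff:
# the belief-MDP "state" that comes back is the updated belief.
function belief_mdp_step(b, a, rng)
    s = rand(rng, b)                     # sample a state from the belief
    sp, o, r = simulate_step(s, a, rng)  # simulate the underlying POMDP
    bp = update_belief(b, a, o)          # filter with the observation
    return (sp = bp, r = r)
end

out = belief_mdp_step([0, 1], 1, MersenneTwister(42))
```

The terminal-state branch in the real method exists because a terminal state yields no meaningful observation, so the belief update is delegated to `gbmdp_handle_terminal` instead.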
93 changes: 75 additions & 18 deletions src/info.jl
@@ -1,24 +1,6 @@
# functions for passing out info from simulations, similar to the info return from OpenAI Gym
# maintained by @zsunberg

"""
Return a tuple containing the next state and reward and information (usually a `NamedTuple`, `Dict` or `nothing`) from that step.
By default, returns `nothing` as info.
"""
function generate_sri(p::MDP, s, a, rng::AbstractRNG)
return generate_sr(p, s, a, rng)..., nothing
end

"""
Return a tuple containing the next state, observation, and reward and information (usually a `NamedTuple`, `Dict` or `nothing`) from that step.
By default, returns `nothing` as info.
"""
function generate_sori(p::POMDP, s, a, rng::AbstractRNG)
return generate_sor(p, s, a, rng)..., nothing
end

"""
a, ai = action_info(policy, x)
@@ -51,3 +33,78 @@ By default, returns `nothing` as info.
function update_info(up::Updater, b, a, o)
return update(up, b, a, o), nothing
end

# once POMDPs v0.8 is released, this should be a jldoctest
"""
add_infonode(ddn::DDNStructure)
Create a new DDNStructure object with a new node labeled :info for returning miscellaneous information about a simulation step.
Typically, the object in info is associative (i.e. a `Dict` or `NamedTuple`) with keys corresponding to different pieces of information.
# Example (using POMDPs v0.8)
```julia
using POMDPs, POMDPModelTools, POMDPPolicies, POMDPSimulators, Random
struct MyMDP <: MDP{Int, Int} end
# add the info node to the DDN
POMDPs.DDNStructure(::Type{MyMDP}) = mdp_ddn() |> add_infonode
# the dynamics involve two random numbers - here we record the values for each in info
function POMDPs.gen(m::MyMDP, s, a, rng)
r1 = rand(rng)
r2 = randn(rng)
return (sp=s+a+r1+r2, r=s^2, info=(r1=r1, r2=r2))
end
m = MyMDP()
@show nodenames(DDNStructure(m))
p = FunctionPolicy(s->1)
for (s,info) in stepthrough(m, p, 1, "s,info", max_steps=5, rng=MersenneTwister(2))
@show s
@show info
end
```
"""
function add_infonode(ddn) # for ::DDNStructure, but it is not declared in v0.7.3, so there is no annotation
add_node(ddn, :info, ConstantDDNNode(nothing), nodenames(ddn))
end

function add_infonode(ddn::POMDPs.DDNStructureV7{nodenames}) where nodenames
return POMDPs.DDNStructureV7{(nodenames..., :info)}()
end

###############################################################
# Note all generate functions will be deprecated in POMDPs v0.8
###############################################################


if DDNStructure(MDP) isa POMDPs.DDNStructureV7
"""
Return a tuple containing the next state and reward and information (usually a `NamedTuple`, `Dict` or `nothing`) from that step.
By default, returns `nothing` as info.
"""
function generate_sri(p::MDP, s, a, rng::AbstractRNG)
return generate_sr(p, s, a, rng)..., nothing
end

"""
Return a tuple containing the next state, observation, and reward and information (usually a `NamedTuple`, `Dict` or `nothing`) from that step.
By default, returns `nothing` as info.
"""
function generate_sori(p::POMDP, s, a, rng::AbstractRNG)
return generate_sor(p, s, a, rng)..., nothing
end

POMDPs.gen(::DDNOut{(:sp,:o,:r,:i)}, m, s, a, rng) = generate_sori(m, s, a, rng)
POMDPs.gen(::DDNOut{(:sp,:o,:r,:info)}, m, s, a, rng) = generate_sori(m, s, a, rng)
POMDPs.gen(::DDNOut{(:sp,:r,:i)}, m, s, a, rng) = generate_sri(m, s, a, rng)
POMDPs.gen(::DDNOut{(:sp,:r,:info)}, m, s, a, rng) = generate_sri(m, s, a, rng)
else
@deprecate generate_sri(args...) gen(DDNOut(:sp,:r,:info), args...)
@deprecate generate_sori(args...) gen(DDNOut(:sp,:o,:r,:info), args...)
end
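The `else` branch above uses Julia's built-in `@deprecate` macro, which defines the old name as a method that forwards to the new call and emits a deprecation warning when depwarns are enabled. A minimal illustration with made-up function names:

```julia
# @deprecate old_area(r) new_area(r) defines old_area so that calling it
# forwards to new_area (and warns when Julia runs with --depwarn=yes).
new_area(r) = pi * r^2
@deprecate old_area(r) new_area(r)

old_area(2.0)  # still works, forwarding to new_area
```

This lets downstream code keep calling `generate_sri`/`generate_sori` during the v0.7 → v0.8 transition while steering users toward `gen(DDNOut(...), ...)`.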

2 comments on commit 1e4096e

@zsunberg
Member Author


@JuliaRegistrator

Registration pull request created: JuliaRegistries/General/3687

After the above pull request is merged, it is recommended that a tag be created on this repository for the registered package version.

This will be done automatically if Julia TagBot is installed, or it can be done manually through the GitHub interface, or via:

git tag -a v0.2.0 -m "<description of version>" 1e4096eb09900033bd0f01b13f6145002e675e3d
git push origin v0.2.0
