- University of California, Berkeley
- Berkeley, California
- http://gleave.me
Pinned
- hill-a/stable-baselines (Public, forked from openai/baselines): A fork of OpenAI Baselines with implementations of reinforcement learning algorithms (see the usage sketch after this list).
- Find best-response to a fixed policy in multi-agent RL.
- HumanCompatibleAI/imitation (Public): Clean PyTorch implementations of imitation and reward learning algorithms.
- HumanCompatibleAI/seals (Public): Benchmark environments for reward modelling and imitation learning algorithms.
- (Experimental) Inverse reinforcement learning from trajectories generated by multiple agents with different (but correlated) rewards.
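As a rough illustration of the stable-baselines fork pinned above, here is a minimal training sketch. It assumes the upstream stable-baselines API (the TensorFlow-based PPO2 entry point); the fork's exact state and version may differ.

```python
from stable_baselines import PPO2

# Train PPO on a toy Gym environment; "MlpPolicy" selects the
# library's default fully-connected policy network.
model = PPO2("MlpPolicy", "CartPole-v1", verbose=1)
model.learn(total_timesteps=10_000)

# Save the trained policy for later reuse.
model.save("ppo2_cartpole")
```

`PPO2.load("ppo2_cartpole")` restores the saved model from the resulting archive.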
295 contributions in the last year
Contribution activity
June 2022
Created 4 commits in 1 repository
Created a pull request in HumanCompatibleAI/imitation that received 2 comments
Expose name_to_{value,count,excluded} maps in HierarchicalLogger
SB3 Logger has several (publicly accessible) maps name_to_value, name_to_count and name_to_excluded that store intermediate values prior to being "…
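For context on the maps named in this PR, here is a minimal sketch of how SB3's base Logger exposes them. This assumes stable-baselines3's public configure/record API and is illustrative only, not the PR's code.

```python
from stable_baselines3.common.logger import configure

# Build an SB3 logger that writes to stdout only.
logger = configure(folder=None, format_strings=["stdout"])

# record() stages a value in the logger's public maps until
# dump() flushes it to the configured output formats.
logger.record("rollout/ep_rew_mean", 42.0)

print(dict(logger.name_to_value))     # staged values, e.g. {"rollout/ep_rew_mean": 42.0}
print(dict(logger.name_to_count))     # sample counts maintained by record_mean()
print(dict(logger.name_to_excluded))  # per-key output-format exclusions

logger.dump(step=0)  # flush and clear the staged values
```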
Reviewed 12 pull requests in 2 repositories
HumanCompatibleAI/imitation (11 pull requests):
- Allow train_rl.py script to load a saved policy to continue training
- Created EMA normalization layer (a hypothetical sketch of the technique follows after this list)
- Customizable schedules for querying preferences
- Load expert models for testing from huggingface hub
- Implement FIFO buffer for preference comparison labels; a few related fixes
- Preference comparisons example improvement
- Switch to new .npz based format for trajectory serialization (a generic .npz sketch follows after this list)
- Add Soft Actor Critic to imitation experiments
- Improve MCE IRL example notebook
- Change dependencies for Python 3.10 support
- Refactored serialize.py to allow for de-serialization of normalized reward functions
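On the "Created EMA normalization layer" review above: the PR's implementation is not shown here, so the following is a hypothetical PyTorch sketch of the general technique (normalizing inputs against exponential-moving-average statistics). The class name, decay value, and buffer layout are all assumptions.

```python
import torch
import torch.nn as nn


class EMANorm(nn.Module):
    """Normalize inputs with exponential moving averages of mean and variance."""

    def __init__(self, num_features: int, decay: float = 0.99, eps: float = 1e-5):
        super().__init__()
        self.decay = decay
        self.eps = eps
        # Running statistics live in buffers: saved in state_dict and moved
        # with the module, but not updated by the optimizer.
        self.register_buffer("running_mean", torch.zeros(num_features))
        self.register_buffer("running_var", torch.ones(num_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.training:
            batch_mean = x.mean(dim=0).detach()
            batch_var = x.var(dim=0, unbiased=False).detach()
            # Exponential moving average update of the running statistics.
            self.running_mean.mul_(self.decay).add_(batch_mean, alpha=1 - self.decay)
            self.running_var.mul_(self.decay).add_(batch_var, alpha=1 - self.decay)
        return (x - self.running_mean) / torch.sqrt(self.running_var + self.eps)
```

Usage: `EMANorm(8)(torch.randn(32, 8))` normalizes a batch of 8-dimensional inputs while updating the running statistics in training mode.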
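Likewise, for the ".npz based format for trajectory serialization" review, here is a generic sketch of the underlying NumPy mechanism. The array names and shapes are assumptions, not imitation's actual schema.

```python
import numpy as np

# Save one trajectory's arrays into a single compressed .npz archive.
obs = np.random.rand(101, 4).astype(np.float32)   # T+1 observations
acts = np.random.randint(0, 2, size=100)          # T actions
rews = np.random.rand(100).astype(np.float32)     # T rewards
np.savez_compressed("trajectory.npz", obs=obs, acts=acts, rews=rews)

# Load it back; np.load returns a lazy, dict-like NpzFile.
with np.load("trajectory.npz") as data:
    assert data["obs"].shape == (101, 4)
    assert data["acts"].shape == (100,)
```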