Quickstart Guide
Software | Version | Required |
---|---|---|
JSBSim | 1.1.5 (GitHub build 277) | default Simulator |
FlightGear | 2020.3.6 | optional – used for JSBSim rendering |
XPlane11 | 11.50r3 (build 115033 64-bit, OpenGL) | optional – used for testing |
XPlaneConnect | 1.3-rc.2 | optional - used for XPlane |
Python | 3.8.2 | Yes |
numpy | 1.19.4 | Yes |
Matplotlib | 3.3.2 | Yes |
tensorflow | 2.3.0 | Yes |
Anaconda | 4.9.2 | No |
Windows | 1909 | Code also runs on Linux |
Other versions might also work; these are the ones QPlane has been tested with.
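A quick sanity check that your Python environment matches the table above (a minimal sketch using only the packages listed there):

```python
# Print installed versions to compare against the requirements table.
import sys
import matplotlib
import numpy
import tensorflow

print("Python    :", sys.version.split()[0])   # tested with 3.8.2
print("numpy     :", numpy.__version__)        # tested with 1.19.4
print("matplotlib:", matplotlib.__version__)   # tested with 3.3.2
print("tensorflow:", tensorflow.__version__)   # tested with 2.3.0
```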
Once all the software packages are installed, clone the GitHub repository into the desired directory on your computer:

```
git clone https://github.com/JDatPNW/QPlane
```
After successfully cloning the repo, navigate into the JSBSim folder to clone the JSBSim repo (if JSBSim is your simulator of choice):

```
cd src/environments/jsbsim
git clone https://github.com/JSBSim-Team/jsbsim
```
JSBSim also needs to be installed; instructions can be found here.
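Once JSBSim is installed, you can verify that its Python bindings work. This is a minimal sketch and assumes JSBSim was installed with its Python module (e.g. via pip); the exact route depends on the instructions linked above:

```python
# Sanity check for the JSBSim Python bindings.
import jsbsim

fdm = jsbsim.FGFDMExec(None)    # None = use JSBSim's default data root
print(fdm.load_model("c172r"))  # the Cessna 172R used in this guide; True on success
```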
If you want rendering to be available for the JSBSim simulator, follow the FlightGear installation instructions here.
Open FlightGear and navigate to the Settings page. Before starting FlightGear, enter:

```
--fdm=null --native-fdm=socket,in,60,localhost,5550,udp --aircraft=c172r --airport=RKJJ
```

in the "Additional Settings" textbox at the bottom of the Settings page.
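These flags disable FlightGear's internal flight dynamics and make it read native FDM packets over UDP on port 5550. If rendering does not seem to work, you can check that packets are actually arriving by running a small listener in place of FlightGear (standard library only; close it before starting FlightGear, since both would bind the same port):

```python
import socket

# Listen where FlightGear would (localhost:5550/udp) and confirm that
# QPlane/JSBSim is emitting FDM packets when rendering is enabled.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("localhost", 5550))
data, addr = sock.recvfrom(4096)
print(f"received {len(data)} bytes from {addr}")
```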
To install XPlane support you need to purchase and download the XPlane simulator. More information can be found here.
Once the game is installed, follow the instructions in the NASA XPC repository to install XPC; they can be found here.
When using XPlane, the game has to be running at all times!
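With XPlane running and the XPC plugin installed, a quick connectivity test using XPlaneConnect's Python client could look like this (a minimal sketch; it assumes the `xpc` module from the NASA repository is importable):

```python
import xpc  # XPlaneConnect Python client from the NASA XPC repository

client = xpc.XPlaneConnect()
try:
    # getPOSI returns the aircraft position (lat, lon, alt, pitch, roll,
    # heading, gear); it times out if the XPC plugin is not running.
    posi = client.getPOSI()
    print("lat/lon/alt:", posi[:3])
finally:
    client.close()
```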
If you want to use the deep RL algorithms, please install TensorFlow following the instructions here.
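A quick check that TensorFlow is installed correctly and, optionally, sees your GPU:

```python
import tensorflow as tf

print(tf.__version__)                          # tested with 2.3.0
print(tf.config.list_physical_devices("GPU"))  # empty list means CPU-only
```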
After all the software is installed, you can test whether QPlane works. To do so, go into the repo's root folder and run the main file:

```
python QPlane.py
```
If you want to run QPlane with modules other than the defaults, the import paths need to be adjusted to point at the desired module. All the built-in modules that come with QPlane have their paths noted in the comments next to the imports. An example could look like this:
Before:

```python
from src.algorithms.QDoubleDeepLearn import QLearn
from src.environments.jsbsim.JSBSimEnv import Env
from src.scenarios.deltaAttitudeControlScene import Scene
```

After:

```python
from src.algorithms.QLearn import QLearn
from src.environments.xplane.XPlaneEnv import Env
from src.scenarios.sparseAttitudeControlScene import Scene
```
Now you can see that we are using regular QLearn instead of the Double Deep Q-Learning module, sparse rewards instead of the delta module, and the XPlane simulator in place of the JSBSim environment.
Available modules are:
Environments:
- jsbsim.JSBSimEnv
- xplane.XPlaneEnv
Algorithms:
- QLearn
- QDeepLearn
- QDoubleDeepLearn
- RandomAgent
Scenarios:
- deltaAttitudeControlScene
- sparseAttitudeControlScene
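Before starting a long run, you can confirm that your chosen module paths resolve. The sketch below checks the example paths from above (run it from the repo's root folder):

```python
# Import the selected modules up front so a typo in a path fails fast
# instead of partway through a long training run.
import importlib

modules = [
    "src.algorithms.QLearn",
    "src.environments.xplane.XPlaneEnv",
    "src.scenarios.sparseAttitudeControlScene",
]

for name in modules:
    importlib.import_module(name)  # raises ImportError on a bad path
    print("OK:", name)
```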
Parameter | Default Value | Description | Format |
---|---|---|---|
logPeriod | 100 | every so many epochs the metrics will be printed into the console | int >0 |
savePeriod | 25 | every so many epochs the table/model will be saved to a file | int >0 |
pauseDelay | 0.01 | time for which an action is applied to the environment | float |
logDecimals | 0 | sets decimals for np.arrays to X for printing | int >=0 |
n_epochs | 50_000 | Number of generations | int >0 |
n_steps | 1_000 | Number of steps per generation | int >0 |
n_actions | 4 | Number of possible actions to choose from | int >0 |
n_states | 182 | Number of states for non-Deep QLearning | int >0 |
gamma | 0.95 | The discount rate, between 0 and 1. If 0, future rewards are ignored; the higher it is, the more the next state's Q value factors into the update | float in [0, 1] |
lr | 0.0001 | Learning rate. Deep ~0.0001 / non-Deep ~0.01. If lr is 0, the Q value never updates; the higher the value, the quicker the agent adopts the new Q value. If lr = 1, the updated value is exactly the newly calculated Q value, completely ignoring the previous one | float in [0, 1] |
epsilon | 1.0 | Starting epsilon value, affects the exploration probability. Will decay over time | float in [0, 1] |
decayRate | 0.00001 | Rate at which epsilon will decay per step | float |
epsilonMin | 0.1 | Minimum value at which epsilon will stop decaying | float in [0, 1] |
n_epochsBeforeDecay | 10 | number of games to be played before epsilon starts to decay | int >=0 |
numOfInputs | 7 | Number of inputs fed to the model | int >0 |
stateDepth | 1 | Number of old observations kept for the current state. The state will consist of s(t) ... s(t−n) | int >0 |
minReplayMemSize | 1_000 | min size determines when the replay will start being used | int >0 |
replayMemSize | 100_000 | Max size for the replay buffer | int >0 |
batchSize | 256 | Batch size for the model | int >0 |
updateRate | 5 | update target model every so many episodes | int >0 |
loadModel | False | will load "model.h5" for tf if True (model.npy for non-Deep) | boolean |
loadMemory | False | will load "memory.pickle" if True | boolean |
loadResults | False | will load "results.npy" if True | boolean |
jsbRender | False | will send UDP data to FlightGear for rendering if True | boolean |
jsbRealTime | False | will slow down the physics to portray real-time rendering | boolean |
usePredefinedSeeds | False | Sets seeds for tf, np and random for more replicable results (not fully replicable due to stochastic environments) | boolean |
saveResultsToPlot | False | Saves results to a PNG in the experiment folder at runtime | boolean |
saveForAutoReload | False | Saves and overwrites models, results, and memory in the root folder | boolean |
startingVelocity | 60 | The starting velocity at which the plane is reset | int >0 |
startingPitchRange | 10 | The range from which the random starting pitch angle will be selected | int >0 |
startingRollRange | 15 | The range from which the random starting roll angle will be selected | int >0 |
randomDesiredState | True | Set a new state to stabilize towards every episode | boolean |
desiredPitchRange | 5 | The range from which the random desired pitch angle will be selected | int >0 |
desiredRollRange | 5 | The range from which the random desired roll angle will be selected | int >0 |
movingRate | 100 | window size for the moving average; works best as a multiple of savePeriod | int >0 |
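To make the roles of lr, gamma, epsilon, decayRate, and epsilonMin concrete, here is a minimal sketch of the updates they describe. This illustrates standard tabular Q-learning with linear per-step epsilon decay, not QPlane's exact code; the linear decay is an assumption, since the table only says epsilon decays by decayRate per step until epsilonMin:

```python
import numpy as np

# Tabular Q-update using the lr and gamma semantics from the table:
# with lr = 0 the Q value never changes; with lr = 1 it is replaced
# outright by the newly calculated target.
def q_update(q, state, action, reward, next_state, lr=0.01, gamma=0.95):
    target = reward + gamma * np.max(q[next_state])
    q[state, action] = (1 - lr) * q[state, action] + lr * target

# Per-step epsilon decay, clamped at epsilonMin (linear decay assumed).
def decay_epsilon(epsilon, decayRate=0.00001, epsilonMin=0.1):
    return max(epsilonMin, epsilon - decayRate)
```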
A VirtualBox image with QPlane on it can be downloaded here. (Password: jd)
Gilbreth:

```
sbatch -t 7-00:00:00 --nodes=1 --gpus=2 --mem=32G --cpus-per-gpu=2 -A partner qplane.sub
```