# Project: Train a Quadcopter How to Fly

Design an agent that can fly a quadcopter, and then train it using a reinforcement learning algorithm of your choice! Try to apply the techniques you have learnt, but also feel free to come up with innovative ideas and test them.

## Task 1: Takeoff

### Implement takeoff agent

Train your agent so that it learns to successfully lift off from the ground. In order to do that, modify the `update()` method in `src/quad_controller_rl/rl_agent.py`, and any other supporting methods that might be necessary, to implement your reinforcement learning algorithm.

The default set point (target) is 10 units above the floor, and the reward function is essentially the negative absolute distance from that set point, clipped so that the penalty never exceeds 20. See the controller code (`src/quad_controller_rl/rl_controller.py`):

```python
reward = -min(abs(self.target - position), 20.0)
```

This is primarily meant for the Hover task (next), but should work for Takeoff as well.
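
To get a feel for how the clipping behaves, here is a minimal standalone sketch of the same formula. It assumes `position` is a scalar height; the function name `clipped_distance_reward` and the `max_penalty` parameter are hypothetical and do not exist in the controller.

```python
def clipped_distance_reward(target, position, max_penalty=20.0):
    """Negative absolute distance to the set point, never below -max_penalty.

    Assumption: `position` is a scalar height, not a full 3D pose.
    """
    return -min(abs(target - position), max_penalty)


if __name__ == "__main__":
    # Set point 10 units above the floor, as in the default task.
    for height in (0.0, 7.5, 10.0, 45.0):
        print(height, clipped_distance_reward(10.0, height))
```

The reward is highest (zero) at the set point and flattens out once the quadcopter is more than 20 units away, so the agent gets no gradient signal beyond that distance.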

### Plot episode rewards

Plot the episode rewards, either from a single run, or averaged over multiple runs.

In [None]:
# TODO: Read and plot episode rewards
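
One possible starting point for the plot is sketched below. It assumes you have already loaded the total reward per episode into plain Python lists, one list per run; how the rewards are logged and read is up to you. The helper names `mean_across_runs`, `rolling_mean` and `plot_rewards` are hypothetical, and `plot_rewards` needs `matplotlib` to be installed.

```python
def mean_across_runs(runs):
    """Average per-episode rewards over several runs.

    `zip` stops at the shortest run, so longer runs are truncated.
    """
    return [sum(episode) / len(episode) for episode in zip(*runs)]


def rolling_mean(values, window):
    """Trailing moving average; early entries average over fewer points."""
    smoothed = []
    total = 0.0
    for i, value in enumerate(values):
        total += value
        if i >= window:
            total -= values[i - window]  # drop the value leaving the window
        smoothed.append(total / min(i + 1, window))
    return smoothed


def plot_rewards(rewards, window=10):
    """Plot raw episode rewards together with a smoothed curve."""
    import matplotlib.pyplot as plt  # imported lazily; only needed here

    plt.plot(rewards, label="episode reward")
    plt.plot(rolling_mean(rewards, window), label=f"rolling mean ({window})")
    plt.xlabel("Episode")
    plt.ylabel("Total reward")
    plt.legend()
    plt.show()
```

For example, `plot_rewards(mean_across_runs([run_1, run_2]))` would plot the average of two runs, where `run_1` and `run_2` are your own reward lists.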

## Task 2: Hover

### Implement hover agent

Now, your agent must take off and hover at the specified set point (default: 10 units above the floor). Same as before, modify the `update()` method (and any other supporting methods) to implement your reinforcement learning algorithm.

### States and rewards

In order for the agent to learn more efficiently, you may need to change the state representation you pass in (e.g. include acceleration, not just position and gravity), how the rewards are computed, etc. You can do this in the controller (`src/quad_controller_rl/rl_controller.py`).
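
As an illustration of extending the state, the sketch below appends a finite-difference velocity estimate to a position vector; applying the same differencing to the velocity would give an acceleration estimate. The class `VelocityAugmenter` is hypothetical and not part of the controller, and it assumes the controller can supply a position sequence and a timestamp at each step.

```python
class VelocityAugmenter:
    """Append a finite-difference velocity estimate to each position."""

    def __init__(self):
        self.prev_position = None
        self.prev_time = None

    def reset(self):
        """Forget history; call this at the start of every episode."""
        self.prev_position = None
        self.prev_time = None

    def __call__(self, position, timestamp):
        if self.prev_position is None or timestamp <= self.prev_time:
            # No usable history yet: report zero velocity.
            velocity = [0.0] * len(position)
        else:
            dt = timestamp - self.prev_time
            velocity = [(p - q) / dt
                        for p, q in zip(position, self.prev_position)]
        self.prev_position = list(position)
        self.prev_time = timestamp
        return list(position) + velocity
```

Resetting between episodes matters here: without it, the first step of a new episode would compute a velocity from the last position of the previous one.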

**Q**: Did you change the state representation or reward function in the controller? If so, please explain below what worked best for you, and why you chose that scheme. Include short code snippet(s) if needed.

**A**: 

### Implementation notes

**Q**: Discuss your implementation below briefly, using the following questions as a guide:

- What algorithm(s) did you try? What worked best for you?
- What was your final choice of hyperparameters (such as $\alpha$, $\gamma$, $\epsilon$, etc.)?
- What neural network architecture did you use (if any)? Specify layers, sizes, activation functions, etc.
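
As a reminder of where these hyperparameters enter, here is a minimal tabular Q-learning sketch with an $\epsilon$-greedy policy. It is only an illustration: it assumes discrete, hashable states and a small discrete action set, whereas the quadcopter has continuous states and actions, so your agent will likely need function approximation instead. The class `TabularQAgent` is hypothetical.

```python
import random
from collections import defaultdict


class TabularQAgent:
    """Tabular Q-learning with an epsilon-greedy behavior policy."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.99, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.q = defaultdict(lambda: [0.0] * n_actions)
        self.rng = random.Random(seed)

    def act(self, state):
        """Pick a random action with probability epsilon, else the greedy one."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        values = self.q[state]
        return values.index(max(values))

    def learn(self, state, action, reward, next_state, done):
        """Move Q(state, action) toward the one-step TD target."""
        if done:
            target = reward
        else:
            target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])
```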

**A**:

### Plot episode rewards

As before, plot the episode rewards, either from a single run, or averaged over multiple runs.

In [None]:
# TODO: Read and plot episode rewards

## Task 3: Landing

What goes up, must come down! But safely!

### Implement landing agent

This time, you will need to edit the starting state of the quadcopter to place it at a position above the floor (at least 10 units), and change the reward function so that the agent learns to settle down gently. For this purpose, you may need to edit the controller node (`scripts/rl_controller_node`, which is also a Python file).
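
One way to encode "gently" is to penalize vertical speed more heavily as the quadcopter nears the floor. The sketch below is a hypothetical reward shape, not the one used in the controller; the function `landing_reward`, its parameters, and the assumption that the floor is at height 0 are all illustrative.

```python
def landing_reward(height, vertical_velocity,
                   speed_weight=2.0, max_penalty=20.0):
    """Hypothetical landing reward: be near the floor, and be slow there.

    Assumptions: the floor is at height 0 and `vertical_velocity` is the
    rate of change of height (negative while descending).
    """
    # Same clipped-distance idea as the takeoff/hover reward, target = floor.
    distance_penalty = min(abs(height), max_penalty)
    # Speed matters most close to the ground: weight is 1 at the floor and
    # decays toward 0 with increasing height.
    proximity = 1.0 / (1.0 + abs(height))
    speed_penalty = speed_weight * proximity * abs(vertical_velocity)
    return -(distance_penalty + speed_penalty)
```

With this shape, touching down at speed 3 scores worse than touching down at speed 0.5, while the same speeds high above the floor are penalized only lightly. The weights would need tuning for your setup.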

### Initial condition, states and rewards

**Q**: Did you change the initial condition (starting state), state representation and/or reward function? If so, please explain below what worked best for you, and why you chose that scheme.

**A**: 

### Implementation notes

**Q**: Discuss your implementation below briefly, using the same questions as before to guide you.

**A**:

### Plot episode rewards

As before, plot the episode rewards, either from a single run, or averaged over multiple runs.

In [None]:
# TODO: Read and plot episode rewards

## Task 4: Combined

In order to design a complete flying system, you will need to incorporate all these basic behaviors into a single agent.

### Set up end-to-end task

The end-to-end task we are considering here is simply to take off, hover in place for some duration, and then land. You will need to update the controller node (`scripts/rl_controller_node`) and controller (`src/quad_controller_rl/rl_controller.py`) to set up this task.
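
One possible way to structure such a task is a small phase tracker that switches the target height over time. The sketch below is hypothetical: the class `PhaseSchedule`, its parameters, and the default tolerance and hover duration are assumptions, and it assumes the controller can provide the current height and a timestamp.

```python
TAKEOFF, HOVER, LAND = "takeoff", "hover", "land"


class PhaseSchedule:
    """Track which phase of the takeoff -> hover -> land task is active."""

    def __init__(self, hover_height=10.0, tolerance=0.5, hover_duration=5.0):
        self.hover_height = hover_height
        self.tolerance = tolerance            # how close counts as "arrived"
        self.hover_duration = hover_duration  # time to hold before landing
        self.phase = TAKEOFF
        self.hover_start = None

    def update(self, height, timestamp):
        """Advance the phase if its exit condition is met; return the phase."""
        if (self.phase == TAKEOFF
                and abs(height - self.hover_height) <= self.tolerance):
            self.phase = HOVER
            self.hover_start = timestamp
        elif (self.phase == HOVER
                and timestamp - self.hover_start >= self.hover_duration):
            self.phase = LAND
        return self.phase

    def target_height(self):
        """Set point for the current phase: the floor only while landing."""
        return 0.0 if self.phase == LAND else self.hover_height
```

The reward can then be computed against `target_height()` so that the same distance-based formula serves all three phases.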

**Q**: What changes did you make to set up the task? Explain briefly.

**A**:

### Implement combined agent

Using your end-to-end task, implement the combined agent so that it learns to take off, hover, and gently come back to ground level.

### Combination scheme and implementation notes

Now, it's up to you whether you want to train three separate (sub-)agents, or a single agent for the complete end-to-end task.
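
If you go with separate sub-agents, one simple arrangement is a wrapper that picks a sub-agent from the current state and delegates to it. The sketch below is hypothetical: `CombinedAgent`, the `step(state, reward, done)` signature, and the phase-selection callable are assumptions for illustration and do not reflect the actual agent interface in `rl_agent.py`.

```python
class CombinedAgent:
    """Delegate each step to one of several sub-agents, chosen by phase."""

    def __init__(self, sub_agents, select_phase):
        # sub_agents: dict mapping a phase name to a callable
        #             (state, reward, done) -> action
        # select_phase: callable mapping a state to a phase name
        self.sub_agents = sub_agents
        self.select_phase = select_phase

    def step(self, state, reward, done):
        phase = self.select_phase(state)
        return self.sub_agents[phase](state, reward, done)


if __name__ == "__main__":
    # Toy usage: state is (height, elapsed_time); actions are just labels.
    def select_phase(state):
        height, elapsed = state
        if elapsed > 8.0:
            return "land"
        return "hover" if height >= 9.5 else "takeoff"

    agent = CombinedAgent(
        {
            "takeoff": lambda s, r, d: "thrust up",
            "hover": lambda s, r, d: "hold",
            "land": lambda s, r, d: "ease down",
        },
        select_phase,
    )
    print(agent.step((2.0, 1.0), -8.0, False))
```

A single end-to-end agent avoids the hand-written switching logic but has to learn the phase transitions itself, typically from a time-dependent target or reward.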

**Q**: What did you end up doing? What challenges did you face, and how did you resolve them? Discuss any other implementation notes below.

**A**:

### Plot episode rewards

As before, plot the episode rewards, either from a single run, or averaged over multiple runs.

In [None]:
# TODO: Read and plot episode rewards

## Reflections

**Q**: Briefly summarize your experience working on this project. You can use the following prompts for ideas.

- What was the hardest part of the project? (e.g. getting started, running ROS, plotting, specific task, etc.)
- How did you approach each task and choose an appropriate algorithm/implementation for it?
- Did you find anything interesting in how the quadcopter or your agent behaved?

**A**:
