# Chapter 15 - Generating Data

## 1 - Plotting a simple line graph

**15-1**. Cubes: A number raised to the third power is a cube. Plot the first five
cubic numbers, and then plot the first 5,000 cubic numbers.

In [None]:
# First part, the first five cubic numbers
import matplotlib.pyplot as plt

x_values = range(1, 6)
y_values = [x ** 3 for x in x_values]

# Generating the graph.
plt.style.use('seaborn-v0_8')
fig, ax = plt.subplots()
ax.scatter(x_values, y_values, c=y_values, cmap=plt.cm.Wistia, s=100)
ax.set_title("Cubes", fontsize=24)
ax.set_xlabel("Values", fontsize=14)
ax.set_ylabel("Cube of Values", fontsize=14)
ax.tick_params(labelsize=14)

plt.show()

In [None]:
# Second part, the first 5_000 cubic numbers.
import matplotlib.pyplot as plt

x_values = range(1, 5_001)
y_values = [x ** 3 for x in x_values]

plt.style.use('seaborn-v0_8-dark')
fig, ax = plt.subplots()
ax.scatter(x_values, y_values, c=y_values, cmap=plt.cm.Reds, s=10)
ax.set_title("Cubes", fontsize=24)
ax.set_xlabel("Values", fontsize=14)
ax.set_ylabel("Cube of Values", fontsize=14)
ax.tick_params(labelsize=14)
ax.axis([0, 5_100, 0, 1.3e11])

plt.show()

**15-2**. Colored Cubes: Apply a colormap to your cubes plot.

*Done*.

## 2 - Random Walk

**15-3**. Molecular Motion: Modify ``rw_visual.py`` by replacing ``ax.scatter()`` with ``ax.plot()``. To simulate the path of a pollen grain on the surface of a drop of water, pass in the ``rw.x_values`` and ``rw.y_values``, and include a ``linewidth`` argument. Use 5,000 instead of 50,000 points to keep the plot from being too busy.

In [None]:
import matplotlib.pyplot as plt

from random_walk import RandomWalk

# Generating a random walk.
rw = RandomWalk()
rw.fill_walk()

# Generating the plot.
fig, ax = plt.subplots(figsize=(10, 10))
ax.plot(rw.x_values, rw.y_values, linewidth=1)
# Mark the start (orange) and end (red) points; zorder keeps them above the line.
ax.scatter(0, 0, c='orange', zorder=3)
ax.scatter(rw.x_values[-1], rw.y_values[-1], c='red', zorder=3)
ax.set_aspect('equal')

ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)

plt.show()

**15-4**. Modified Random Walks: In the RandomWalk class, ``x_step`` and ``y_step`` are generated from the same set of conditions. The direction is chosen randomly from the list ``[1, -1]`` and the distance from the list ``[0, 1, 2, 3, 4]``. Modify the values in these lists to see what happens to the overall shape of your walks. Try a longer list of choices for the distance, such as 0 through 8, or remove the −1 from the ``x-`` or ``y-direction`` list.

*Done*. Changing the distance to something like ``x_distance = choice(range(9))`` barely affects the overall visualization: with 5,000 points, the individual steps are too small for the extra variation in magnitude to be noticeable. Changing the directions on either axis to, say, ``[1, 0]`` has a bigger effect. The walk can no longer move backward along that axis, so every step there goes in one direction (or nowhere), and the scatter plot ends up looking like a line. The line plot drawn with ``ax.plot()`` does not change as much.
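
The observations above can be checked without plotting. The following is a minimal sketch, not the book's ``RandomWalk`` class: ``simulate_walk()`` and its parameters are hypothetical names introduced here, and the step logic is assumed to follow the direction-times-distance rule described in the exercise. It prints how far each variant spreads along both axes.

```python
from random import Random


def simulate_walk(num_points=5000, x_directions=(1, -1), y_directions=(1, -1),
                  distances=range(5), seed=None):
    """Return the x and y coordinates of a random walk starting at (0, 0)."""
    rng = Random(seed)
    x_values, y_values = [0], [0]
    while len(x_values) < num_points:
        # Each step is a randomly chosen direction times a randomly chosen distance.
        x_step = rng.choice(x_directions) * rng.choice(distances)
        y_step = rng.choice(y_directions) * rng.choice(distances)
        # Skip moves that go nowhere.
        if x_step == 0 and y_step == 0:
            continue
        x_values.append(x_values[-1] + x_step)
        y_values.append(y_values[-1] + y_step)
    return x_values, y_values


# Hypothetical variants matching the changes discussed in the text.
variants = {
    "default": {},
    "distances 0 through 8": {"distances": range(9)},
    "x directions [1, 0]": {"x_directions": (1, 0)},
}
for name, kwargs in variants.items():
    xs, ys = simulate_walk(seed=15, **kwargs)
    print(f"{name}: x spans {min(xs)} to {max(xs)}, y spans {min(ys)} to {max(ys)}")
```

With ``[1, 0]`` as the x directions, the x-coordinates never decrease, so the walk drifts steadily to one side instead of wandering around the origin.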

**15-5**. Refactoring: The ``fill_walk()`` method is lengthy. Create a new method called ``get_step()`` to determine the direction and distance for each step, and then calculate the step. You should end up with two calls to ``get_step()`` in ``fill_walk()``:
```
x_step = self.get_step()
y_step = self.get_step()
```
This refactoring should reduce the size of ``fill_walk()`` and make the method easier to read and understand.

*Done*.
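
The refactored class is not shown in this notebook, so the following is only a sketch of what it might look like. The attribute names ``num_points``, ``x_values``, and ``y_values`` come from how the cells above use ``rw``; the constructor default of 5,000 points and the rest of the class body are assumptions.

```python
from random import choice


class RandomWalk:
    """A class to generate random walks (assumed layout, see note above)."""

    def __init__(self, num_points=5000):
        self.num_points = num_points
        # All walks start at (0, 0).
        self.x_values = [0]
        self.y_values = [0]

    def get_step(self):
        """Return one signed step: a random direction times a random distance."""
        direction = choice([1, -1])
        distance = choice([0, 1, 2, 3, 4])
        return direction * distance

    def fill_walk(self):
        """Calculate all the points in the walk."""
        while len(self.x_values) < self.num_points:
            x_step = self.get_step()
            y_step = self.get_step()
            # Reject moves that go nowhere.
            if x_step == 0 and y_step == 0:
                continue
            self.x_values.append(self.x_values[-1] + x_step)
            self.y_values.append(self.y_values[-1] + y_step)
```

Because both axes now share ``get_step()``, changing the direction or distance lists in one place affects the whole walk.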

## 3 - Rolling Dice with Plotly

**15-6**. Two D8s: Create a simulation showing what happens when you roll two eight-sided dice 1,000 times. Try to picture what you think the visualization will look like before you run the simulation, then see if your intuition was correct. Gradually increase the number of rolls until you start to see the limits of your system’s capabilities.

In [None]:
import plotly.express as px
from die import Die

die_1 = Die(8)
die_2 = Die(8)

results = []
for roll_num in range(1000):
    results.append(die_1.roll() + die_2.roll())

frequencies = []
max_result = die_1.num_sides + die_2.num_sides
possible_results = range(2, max_result + 1)
for value in possible_results:
    frequencies.append(results.count(value))

title = "Results of rolling two D8 1000 times."
labels = { 'x' : 'Result', 'y' : 'Frequency of Result' }
fig = px.bar(x=possible_results, y=frequencies, title=title, labels=labels)
fig.update_layout(xaxis_dtick=1)

fig.show()
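
As the number of rolls grows, note that ``results.count(value)`` rescans the entire list once for every possible result. One possible alternative, sketched here under assumptions, tallies all results in a single pass with ``collections.Counter``. The ``Die`` class is not shown in this notebook, so ``random.randint`` stands in for ``Die.roll()``, and ``roll_frequencies()`` is a hypothetical helper name.

```python
from collections import Counter
from random import randint


def roll_frequencies(num_rolls, num_dice=2, num_sides=8):
    """Roll the dice num_rolls times and count how often each total occurs."""
    # randint(1, num_sides) is a stand-in for Die(num_sides).roll().
    totals = (
        sum(randint(1, num_sides) for _ in range(num_dice))
        for _ in range(num_rolls)
    )
    counts = Counter(totals)  # one pass over all the rolls
    possible_results = range(num_dice, num_dice * num_sides + 1)
    # A Counter returns 0 for totals that never came up.
    frequencies = [counts[value] for value in possible_results]
    return possible_results, frequencies


possible_results, frequencies = roll_frequencies(1000)
for value, frequency in zip(possible_results, frequencies):
    print(value, frequency)
```

The returned range and list can be passed to ``px.bar()`` in the same way as ``possible_results`` and ``frequencies`` in the cell above.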

**15-7**. Three Dice: When you roll three D6 dice, the smallest number you can roll is 3 and the largest number is 18. Create a visualization that shows what happens when you roll three D6 dice.

In [None]:
import plotly.express as px
from die import Die

die_1 = Die()
die_2 = Die()
die_3 = Die()

results = []
for roll_num in range(1_000_000):
    results.append(die_1.roll() + die_2.roll() + die_3.roll())

frequencies = []
max_result = die_1.num_sides + die_2.num_sides + die_3.num_sides
possible_results = range(3, max_result + 1)
for value in possible_results:
    frequencies.append(results.count(value))

title = "Results of rolling three D6 1,000,000 times."
labels = { 'x' : 'Result', 'y' : 'Frequency of Result' }
fig = px.bar(x=possible_results, y=frequencies, title=title, labels=labels)
fig.update_layout(xaxis_dtick=1)

fig.show()

**15-8**. Multiplication: When you roll two dice, you usually add the two numbers together to get the result. Create a visualization that shows what happens if you multiply these numbers by each other instead.

In [None]:
import plotly.express as px
from die import Die

die_1 = Die()
die_2 = Die()

results = []
for roll_num in range(100_000):
    results.append(die_1.roll() * die_2.roll())

# Products range from 1 (1 * 1) to 36 (6 * 6). Many values in between never
# occur, for example any prime larger than 6 (7, 11, 13, ...).
frequencies = []
max_result = die_1.num_sides * die_2.num_sides
possible_results = range(1, max_result + 1)
for value in possible_results:
    frequencies.append(results.count(value))

title = "Results of rolling two D6 100,000 times, but multiplying."
labels = { 'x' : 'Result', 'y' : 'Frequency of Result' }
fig = px.bar(x=possible_results, y=frequencies, title=title, labels=labels)
fig.update_layout(xaxis_dtick=1)

fig.show()
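
Because two six-sided dice have only 36 equally likely outcomes, the shape of this chart can also be worked out exactly rather than simulated. The sketch below enumerates every pair of faces with ``itertools.product`` and tallies the products. It assumes standard dice numbered 1 through 6 and does not use the ``Die`` class.

```python
from collections import Counter
from itertools import product

faces = range(1, 7)
# Tally the product of every ordered pair of faces (36 pairs in total).
product_counts = Counter(a * b for a, b in product(faces, repeat=2))

max_result = 6 * 6
# Values between 1 and 36 that no pair of faces can produce.
unreachable = [n for n in range(1, max_result + 1) if n not in product_counts]

for value in sorted(product_counts):
    print(f"{value:>2}: {product_counts[value]}/36")
print("Never rolled:", unreachable)
```

Only 18 of the values from 1 to 36 can occur, and 6 and 12 are the most likely products, each arising from 4 of the 36 pairs.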

**15-9**. Die Comprehensions: For clarity, the listings in this section use the long form of for loops. If you’re comfortable using list comprehensions, try writing a comprehension for one or both of the loops in each of these programs.

In [None]:
import plotly.express as px
from die import Die

die_1 = Die()
die_2 = Die()

results = [die_1.roll() + die_2.roll() for n in range(1000)]

max_result = die_1.num_sides + die_2.num_sides
possible_results = range(2, max_result + 1)

frequencies = [results.count(value) for value in possible_results]

title = "Results of rolling two D6 1000 times."
labels = { 'x' : 'Result', 'y' : 'Frequency of Result' }
fig = px.bar(x=possible_results, y=frequencies, title=title, labels=labels)
fig.update_layout(xaxis_dtick=1)

fig.show()

**15-10**. Practicing with Both Libraries: Try using Matplotlib to make a die-rolling visualization, and use Plotly to make the visualization for a random walk. (You’ll need to consult the documentation for each library to complete this exercise.)

In [None]:
# First one: rolling dice with matplotlib.
import matplotlib.pyplot as plt
from die import Die

die_1 = Die()
die_2 = Die()

results = [die_1.roll() + die_2.roll() for n in range(1000)]

max_result = die_1.num_sides + die_2.num_sides
possible_results = range(2, max_result + 1)

frequencies = [results.count(value) for value in possible_results]

fig, ax = plt.subplots(figsize=(8, 6), layout='constrained')
# For discrete data, use bins=range(min, max + 2) so each integer gets its own bin.
# With max=12, the last bin runs from 12 to 13, so 13 must be a bin edge; since
# range() excludes its stop value, that means range(2, 14).
ax.hist(results, bins=range(2, 14), linewidth=0.5, edgecolor='white', color='deepskyblue')

ax.set(xlim=(2, 13), xticks=range(2, 13))
ax.set_title("Result of rolling two D6 1000 times.", fontsize=24)
ax.set_xlabel("Result", fontsize=14)
ax.set_ylabel("Frequency of Result", fontsize=14)
ax.tick_params(labelsize=14)

plt.show()

In [None]:
# Second, random walk with Plotly.
import plotly.express as px
from random_walk import RandomWalk

# Generating a random walk.
rw = RandomWalk()
rw.fill_walk()

title = "Random Walk"
steps = range(len(rw.x_values))

# Any built-in color scale can be reversed by appending '_r' to its name.
fig = px.scatter(x=rw.x_values, y=rw.y_values, title=title, color=steps,
                 color_continuous_scale='ice_r')

fig.show()