# Let's harvest some numbers of our own!
**by Serhat Çevikel**

For basic simulation, exercise and visualization purposes, it helps to have ways to create our own data series from scratch!

## Sequences with seq()

The colon operator ":" creates simple sequences running from "from" to "to" in increments of 1. Suppose we want more complex sequences, for example with a different increment, with a given length, or with non-integer values.

In [None]:
?seq

> seq(from = 1, to = 1, by = ((to - from)/(length.out - 1)),
    length.out = NULL, along.with = NULL, ...)

Get the Fahrenheit equivalents of the Celsius values 0 to 100 as a sequence:

In [None]:
seq(32, 212, length.out = 101)
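
As a cross-check, the same values can be obtained by converting a Celsius sequence with the standard formula F = C * 9/5 + 32. This is only a sketch; the variable names `celsius` and `fahrenheit` are illustrative and not part of the original notebook:

```r
# Celsius values from 0 to 100 in steps of 1 (101 values)
celsius <- seq(0, 100, by = 1)

# Convert each value with F = C * 9/5 + 32
fahrenheit <- celsius * 9 / 5 + 32

# Compare with the sequence built directly from the Fahrenheit end points
all.equal(fahrenheit, seq(32, 212, length.out = 101))
```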

Get values from 1 to 11 in increments of 3 (the sequence stops at 10, the last value that does not exceed 11):

In [None]:
seq(1, 11, 3)

**EXERCISE 1:**

Get the leap years in the 21st century. Two questions:

- Is 2000 part of the 21st century?
- Is 2100 a leap year?

**SOLUTION 1:**

In [None]:
pass <- readline(prompt = "Please enter the password for the solution: ")
encrypt <- "U2FsdGVkX198ZSSyA+t1DYuAmww2gW9TbGcE6fCAeTiiethARzNXtzhKGLIBk0vu HfDfmGjzJc285wGcIFH4HFZFln1pRgi2WPYTFxUInFt/m7WwnNyNqQ8o4ySiOV3F QVdBpIRmQlanyJKem+pAAetiT4Of1AmATLsffovt6mmcU57vMe+YMHrFbxd5U+18 6kk00u0MVIgeDGZ6VpxPqdRCKY5rmO+g/26NHNp6dTUt46Nzpt81Uc2OghYLf16F w+jFhGoqHfovbfDD66wC12WgU5PoWA+x/TO9ti5eKuOor2FHFkAZIq1sb8bSMrO7 EkUBrjMn7cunrTuDNpZXCgvWCWgU8ZdQ4EzoX3LRCsw="
solution <- system(sprintf("echo %s | openssl enc -md sha256 -aes-128-cbc -a -d -salt -pass pass:%s 2> /dev/null", encrypt, pass), intern = T, ignore.stderr = T)
cat(solution, sep = "\n")
eval(parse(text = solution))

## Uniform numeric values within a range

In [None]:
runif(10)

In [None]:
runif(20, -1, 1)

## Random values from a normal distribution

By default, the mean is 0 and the standard deviation is 1:

In [None]:
rnorm(10)
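
The mean and standard deviation can also be passed explicitly. A minimal sketch; the values 50 and 5 and the name `x` are arbitrary choices for illustration:

```r
# 1000 draws from a normal distribution centered at 50 with sd 5
x <- rnorm(1000, mean = 50, sd = 5)

mean(x)   # sample mean, should be close to 50
sd(x)     # sample standard deviation, should be close to 5
```

With a sample this large, the sample statistics usually land near the requested parameters, but they will differ slightly on every run.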

## Randomly select out of a set

In [None]:
sample(1:2, 10, replace = T)

In [None]:
sample(1:10, 10, replace = F)
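
Sampling without replacement, with as many draws as the set has elements, yields a random permutation of that set. A small check of this (the name `perm` is illustrative):

```r
# Draw all 10 values without replacement: every value appears exactly once
perm <- sample(1:10, 10, replace = FALSE)
perm

sort(perm)            # sorting recovers 1, 2, ..., 10
anyDuplicated(perm)   # 0 means there are no repeated values
```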

## Deterministic randomness

The random numbers we create are not truly random but pseudo-random: they are produced by a deterministic algorithm that starts from a seed value (R's default generator is the Mersenne Twister; simpler generators rely on little more than a modulo operation). If we provide the starting "seed" ourselves, the sequence is deterministic and reproducible!
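
To make the idea of a seeded, deterministic generator concrete, here is a toy linear congruential generator, one of the simplest schemes built on a modulo operation. It is only an illustration: the function name `toy_lcg` is made up for this sketch, the constants are the classic Park-Miller "minimal standard" choice, and this is not how R's own `runif()` works internally.

```r
# Toy linear congruential generator (illustration only, not R's generator)
toy_lcg <- function(n, seed) {
  m <- 2^31 - 1   # modulus
  a <- 16807      # multiplier (Park-Miller "minimal standard")
  x <- numeric(n)
  state <- seed
  for (i in seq_len(n)) {
    state <- (a * state) %% m   # next state depends only on the previous one
    x[i] <- state / m           # scale the state into the interval (0, 1)
  }
  x
}

toy_lcg(5, seed = 42)   # same seed ...
toy_lcg(5, seed = 42)   # ... same sequence
toy_lcg(5, seed = 43)   # a different seed gives a different sequence
```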

In [None]:
set.seed(1000)
runif(10)

In [None]:
set.seed(1000)
runif(10)

In [None]:
set.seed(1001)
runif(10)
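
Reproducibility can also be checked programmatically with `identical()` instead of comparing the printed numbers by eye. A brief sketch (the names `first`, `second` and `other` are illustrative):

```r
set.seed(1000); first  <- runif(10)
set.seed(1000); second <- runif(10)   # same seed as first
set.seed(1001); other  <- runif(10)   # different seed

identical(first, second)   # TRUE: same seed, same sequence
identical(first, other)    # FALSE: different seed, different sequence
```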

# Let's visualize

Now that we can create our own datasets, let's visualize the data in a simple way.

In [None]:
set.seed(1000)
series_1 <- runif(100, -10, 10)
series_1

In [None]:
set.seed(1200)
series_2 <- runif(100, -10, 10)
series_2

In [None]:
series_3 <- series_1 + series_2

Now make a scatterplot of series_3 against series_1:

In [None]:
plot(series_1, series_3)
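
Since series_3 contains series_1 as one of its two components, the points cluster around an upward-sloping line. One way to quantify this is the correlation coefficient: for two independent series with equal variance, the theoretical correlation between one of them and their sum is 1/sqrt(2), about 0.71. The sketch below rebuilds the series so that it runs on its own:

```r
# Rebuild the three series exactly as above
set.seed(1000)
series_1 <- runif(100, -10, 10)
set.seed(1200)
series_2 <- runif(100, -10, 10)
series_3 <- series_1 + series_2

cor(series_1, series_3)   # sample correlation, expected near 1 / sqrt(2)
1 / sqrt(2)               # theoretical value for comparison
```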

We may change the axis labels, the plot title, and the color and type of the markers:

In [None]:
plot(series_1, series_3, pch=5, col="blue", xlab="x observations", ylab="y observations")
title("Weight vs. height")

## Line plots

Let's generate our own stock price sequence.

First, draw 100 log returns from a normal distribution with mean 0 and standard deviation 0.01:

In [None]:
logret <- rnorm(100, 0, 0.01)
logret

Exponentiate (e^x) to turn the log returns into gross returns:

In [None]:
logexp <- exp(logret)
logexp

Take cumulative products to get the price path, relative to a starting price of 1:

In [None]:
logcum <- cumprod(logexp)
logcum

And let's plot:

In [None]:
plot(1:100, logcum, type = "l")

Seems to walk not so randomly!
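
Because exp(a) * exp(b) = exp(a + b), the same price path can be obtained by summing the log returns first and exponentiating once. A short sketch of this equivalence (the names `path_prod` and `path_sum` are illustrative):

```r
logret <- rnorm(100, 0, 0.01)

path_prod <- cumprod(exp(logret))   # multiply gross returns step by step
path_sum  <- exp(cumsum(logret))    # add log returns, then exponentiate

all.equal(path_prod, path_sum)      # TRUE up to floating-point error
```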

## Histograms

Create normally distributed numbers with mean 10 and standard deviation 2:

In [None]:
vec_norm <- rnorm(100, 10, 2)
vec_norm

Create a histogram:

In [None]:
hist(vec_norm)

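The default breaks typically fall on whole numbers (for example 6 to 14); the exact range depends on the values drawn.

We can ask for fewer or more bins by passing a bin count (R treats it as a suggestion and may adjust it to get round break points):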

In [None]:
hist(vec_norm, 5)

In [None]:
hist(vec_norm, 20)
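
To see which break points `hist()` actually settled on for a requested bin count, the histogram object can be computed without drawing it. A sketch, using `plot = FALSE` purely for inspection (the name `h` is illustrative):

```r
vec_norm <- rnorm(100, 10, 2)

# Compute the bins without drawing the plot
h <- hist(vec_norm, breaks = 5, plot = FALSE)

h$breaks   # break points actually used
h$counts   # number of observations in each bin
```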

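Or specify the cutting points of the bins explicitly. The breaks must span the whole range of the data, otherwise hist() stops with an error, so we derive the end points from the data:

In [None]:
hist(vec_norm, breaks = seq(floor(min(vec_norm)), ceiling(max(vec_norm)), by = 0.5))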

# Exercises

**EXERCISE 2:**

Write an R expression that simulates the outcome of the 6/49 Lottery (Sayısal Loto), where one draws 6 numbers from 1, 2, ..., 49. Note that the same number cannot appear twice in one drawing.

**SOLUTION 2:**

In [None]:
pass <- readline(prompt = "Please enter the password for the solution: ")
encrypt <- "U2FsdGVkX19sWJ9lF7qpFxDONhnKoLyuytToYxAZEIoImCCQZUHxFiUG3Lelli80"
solution <- system(sprintf("echo %s | openssl enc -md sha256 -aes-128-cbc -a -d -salt -pass pass:%s 2> /dev/null", encrypt, pass), intern = T, ignore.stderr = T)
cat(solution, sep = "\n")
eval(parse(text = solution))

**EXERCISE 3:**

Generate 1000 random numbers, drawn from the normal distribution with standard deviation 2, and another 1000 with standard deviation 0.5.

Plot the histogram for each set of numbers. What can you say about the effect of the standard deviation?

**SOLUTION 3:**

In [None]:
pass <- readline(prompt = "Please enter the password for the solution: ")
encrypt <- "U2FsdGVkX1/bD2yBw48w9AUJEo/8FuDWVGU31+8rfEJR3A0/7Ua7vNNizQ3KiCm1 42w0twVHkHvRWCKZJLd8tw2OonG6/j+hujCeahlHX2rOjhr2+Arfs+s6Lta5Aa9+ YsEoxgQPEGGXowd2JVwpYy3h9Zl6vOk9T6lopyJY3T0mHTHZ8TERgdCixKpn4hql LkAm5Ec+cDUauWY/cZChx7scxYTnb5eLsx66YWFtm7fPDVMbxeB4flXam9CAXHOP Zlg+wpFRmcmBpOpvPeVGlg=="
solution <- system(sprintf("echo %s | openssl enc -md sha256 -aes-128-cbc -a -d -salt -pass pass:%s 2> /dev/null", encrypt, pass), intern = T, ignore.stderr = T)
cat(solution, sep = "\n")
eval(parse(text = solution))

**EXERCISE 4:**

Throw 10 coins and count the number of heads.

Repeat this experiment ten times, and find the mean of the number of heads.

**SOLUTION 4:**

In [None]:
pass <- readline(prompt = "Please enter the password for the solution: ")
encrypt <- "U2FsdGVkX1/jUdM7nJMTxHeUDi2lJkTiDgyjxeUW0T4panxKji0H1vOxZyms4b3J xTx8gk+9sZEK0i+Nuu2MJP8ozSXjfEDuQsSf/5A6kYY="
solution <- system(sprintf("echo %s | openssl enc -md sha256 -aes-128-cbc -a -d -salt -pass pass:%s 2> /dev/null", encrypt, pass), intern = T, ignore.stderr = T)
cat(solution, sep = "\n")
eval(parse(text = solution))

**EXERCISE 5:**

Throw 3 dice 10000 times. Plot the histogram of the outcomes (outcomes should be between 3 and 18).

Hint: You can use sample, cumsum, seq and Exercise 5 from CMPE_140_02_PS!

**SOLUTION 5:**

In [None]:
pass <- readline(prompt = "Please enter the password for the solution: ")
encrypt <- "U2FsdGVkX1/WJMccpnyNqQTS5ZEm4yiinJDeWgBMRNXd1izrG6xVXSGQU4qtRffK xzxXlkNf1kT6/o3ma9UPRH7LeVts1vxoVgJ6s2Xf3vxiU4dBQET43ethyCKFeQQ7 yTMO6PZCdcL0Up1U+DAYXIoBSXHLLxOfeJa+mPDs4mHvp3slZnLxZboXjBtQxkUo jrj2GvdxDY4h6SVLAynuRzwUBbqY7BJZAmICbDMu9MjPXa4VpZh+S4s0veYyX6Y0 qxs2A+SnoEDYL08KPFqZ03HW32quZx5UJhEpScSpzGryL2QY6PJ9RZhWp7w4D/3H QMJEsncM3a/7QZgHKYmmnwNuT8I3WpPhZ7v6zQgNZIy0kfTvXri5sqi1LsNV1fOq hOZg6ZxPAHAGB9ohIg1eWvsUqGiMHppxaiQrO+KNCa70EdG7UpMt7wfu0IQvFuM+ qwwFXd9q02iT2O4cportJzxwj3t5RUCB98SivBLQlh5QlAhDwaXrPqxE1y2SjFMR k8Ibua58hLQWiEbBL74sU/Jy7X7DHkikLLx46Px2o5Bv8o7mdCDkJTN9uvPAKnFZ 2ntXmFlINVKid0VFmOSV/RED5MFXsRGY9osKlkGFuv/y6vBr7UJKPfvSomfx5Uw+ q+PIzE9Stpl42fLgz5aBb4B5rKfO42DLU/mDBzjyOwR7fxdmZDfCA3OGcBOrjLSa 67PneoqgN5EbLZTGFniS7fqJ4DUlCS4FYR7madpPfuANodpvHzw2vOXbWPvZx3ms YFZHtO6sJ7lp6lAPCnGj0O7LfuTvVL+7fE1gKKSfVgwyDQiCqzvUN0ocpLXl2SaS B8bArsYFjm18tHqkSPhch4J2GedJry8TXlYvKq1LRhxOswvukTMEtbjBY1yd4xC/"
solution <- system(sprintf("echo %s | openssl enc -md sha256 -aes-128-cbc -a -d -salt -pass pass:%s 2> /dev/null", encrypt, pass), intern = T, ignore.stderr = T)
cat(solution, sep = "\n")
eval(parse(text = solution))