"""
Wind Statistics
---------------

Topics: Using array methods over different axes, fancy indexing.

1.  The data in 'wind.data' has the following format::

         61  1  1 15.04 14.96 13.17  9.29 13.96  9.87 13.67 10.25 10.83 12.58 18.50 15.04
         61  1  2 14.71 16.88 10.83  6.50 12.62  7.67 11.50 10.04  9.79  9.67 17.54 13.83
         61  1  3 18.50 16.88 12.33 10.13 11.17  6.17 11.25  8.04  8.50  7.67 12.75 12.71

    The first three columns are year, month and day. The
    remaining 12 columns are average windspeeds in knots at 12
    locations in Ireland on that day.

    Use the 'loadtxt' function from numpy to read the data into
    an array.

2.  Calculate the min, max and mean windspeeds and standard deviation of the
    windspeeds over all the locations and all the times (a single set of numbers
    for the entire dataset).

3.  Calculate the min, max and mean windspeeds and standard deviations of the
    windspeeds at each location over all the days (a different set of numbers
    for each location).

4.  Calculate the min, max and mean windspeeds and standard deviations of the
    windspeeds across all the locations on each day (a different set of numbers
    for each day).

5.  Find the location which has the greatest windspeed on each day (an integer
    column number for each day).

6.  Find the year, month and day on which the greatest windspeed was recorded.

7.  Find the average windspeed in January for each location.

You should be able to perform all of these operations without using a for
loop or other looping construct.

Bonus
~~~~~

1. Calculate the mean windspeed for each month in the dataset.  Treat
   January 1961 and January 1962 as *different* months. (hint: first find a
   way to create an identifier unique for each month. The second step might
   require a for loop.)

2. Calculate the min, max and mean windspeeds and standard deviations of the
   windspeeds across all locations for each week (assume that the first week
   starts on January 1 1961) for the first 52 weeks. This can be done without
   any for loop.

Bonus Bonus
~~~~~~~~~~~

Calculate the mean windspeed for each month without using a for loop.
(Hint: look at `searchsorted` and `add.reduceat`.)

Notes
~~~~~

These data were analyzed in detail in the following article:

   Haslett, J. and Raftery, A. E. (1989). Space-time Modelling with
   Long-memory Dependence: Assessing Ireland's Wind Power Resource
   (with Discussion). Applied Statistics 38, 1-50.


See :ref:`wind-statistics-solution`.
"""

from numpy import loadtxt


1.  The data in 'wind.data' has the following format::

         61  1  1 15.04 14.96 13.17  9.29 13.96  9.87 13.67 10.25 10.83 12.58 18.50 15.04
         61  1  2 14.71 16.88 10.83  6.50 12.62  7.67 11.50 10.04  9.79  9.67 17.54 13.83
         61  1  3 18.50 16.88 12.33 10.13 11.17  6.17 11.25  8.04  8.50  7.67 12.75 12.71

    The first three columns are year, month and day. The
    remaining 12 columns are average windspeeds in knots at 12
    locations in Ireland on that day.

    Use the 'loadtxt' function from numpy to read the data into
    an array.


In [3]:
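import numpy as np
from numpy import loadtxt

# Each row is one day: year, month, day, then 12 windspeed columns.
data = loadtxt('wind.data')
print('shape=', data.shape)
date = data[:, :3]   # year, month, day
print('date=', date)
wind = data[:, 3:]   # windspeeds at the 12 locations
wind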


shape= (6574, 15)
date= [[61.  1.  1.]
 [61.  1.  2.]
 [61.  1.  3.]
 ...
 [78. 12. 29.]
 [78. 12. 30.]
 [78. 12. 31.]]


array([[15.04, 14.96, 13.17, ..., 12.58, 18.5 , 15.04],
       [14.71, 16.88, 10.83, ...,  9.67, 17.54, 13.83],
       [18.5 , 16.88, 12.33, ...,  7.67, 12.75, 12.71],
       ...,
       [14.  , 10.29, 14.42, ..., 16.42, 18.88, 29.58],
       [18.5 , 14.04, 21.29, ..., 12.12, 14.67, 28.79],
       [20.33, 17.41, 27.29, ..., 11.38, 12.08, 22.08]], shape=(6574, 12))
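
The loading step can be tried without the data file. The sketch below is only an illustration: it feeds the three sample rows quoted in the exercise to `loadtxt` through an in-memory `io.StringIO` buffer (an assumption standing in for 'wind.data') and, as an optional extra, converts the date columns to integers.

```python
import io
import numpy as np

# The three sample rows from the exercise text, standing in for 'wind.data'.
sample = io.StringIO(
    "61  1  1 15.04 14.96 13.17  9.29 13.96  9.87 13.67 10.25 10.83 12.58 18.50 15.04\n"
    "61  1  2 14.71 16.88 10.83  6.50 12.62  7.67 11.50 10.04  9.79  9.67 17.54 13.83\n"
    "61  1  3 18.50 16.88 12.33 10.13 11.17  6.17 11.25  8.04  8.50  7.67 12.75 12.71\n"
)

data = np.loadtxt(sample)         # float array, one row per day
date = data[:, :3].astype(int)    # year, month, day as integers
wind = data[:, 3:]                # the 12 windspeed columns

print(data.shape)   # (3, 15)
print(date[0])      # [61  1  1]
print(wind.shape)   # (3, 12)
```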

2.  Calculate the min, max and mean windspeeds and standard deviation of the
    windspeeds over all the locations and all the times (a single set of numbers
    for the entire dataset).


In [6]:
print('max=',wind.max())
print('min=',wind.min())
print('mean=',wind.mean())
print('std=',wind.std())


max= 42.54
min= 0.0
mean= 10.22837377040868
std= 5.603840181095793


3.  Calculate the min, max and mean windspeeds and standard deviations of the
    windspeeds at each location over all the days (a different set of numbers
    for each location).


In [7]:
print('max of each location=',wind.max(axis=0))
print('min of each location=',wind.min(axis=0))
print('mean of each location=',wind.mean(axis=0))
print('std of each location=',wind.std(axis=0))


max of each location= [35.8  33.37 33.84 28.46 37.54 26.16 30.37 31.08 25.88 28.21 42.38 42.54]
min of each location= [0.67 0.21 1.5  0.   0.13 0.   0.   0.   0.   0.04 0.13 0.67]
mean of each location= [12.36371463 10.64644813 11.66010344  6.30627472 10.45688013  7.09225434
  9.7968345   8.49442044  8.49581838  8.70726803 13.121007   15.59946152]
std of each location= [5.61918301 5.26820081 5.00738377 3.60513309 4.93536333 3.96838126
 4.97689374 4.49865783 4.16746101 4.50327222 5.83459319 6.69734719]


4.  Calculate the min, max and mean windspeeds and standard deviations of the
    windspeeds across all the locations on each day (a different set of numbers
    for each day).


In [10]:
print('max of each day=', wind.max(axis=1))
print('min of each day=', wind.min(axis=1))
print('mean of each day=', wind.mean(axis=1))
print('std of each day=', wind.std(axis=1))


max of each day= [18.5  17.54 18.5  ... 29.58 28.79 27.29]
min of each day= [9.29 6.5  6.17 ... 8.71 9.13 9.59]
mean of each day= [13.09666667 11.79833333 11.34166667 ... 14.89       15.3675
 15.4025    ]
std of each day= [2.5773188  3.28972854 3.50543348 ... 5.51175108 5.30456427 5.45971172]
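
The difference between the per-location and per-day reductions comes down to which axis is collapsed. A minimal sketch on a hypothetical toy table (the name `toy` and its values are made up for illustration) may make this easier to see:

```python
import numpy as np

# Hypothetical toy table: 2 days (rows) x 3 locations (columns).
toy = np.array([[1.0, 5.0, 3.0],
                [4.0, 2.0, 6.0]])

per_location = toy.max(axis=0)   # collapse the day axis: one value per column
per_day = toy.max(axis=1)        # collapse the location axis: one value per row

print(per_location)  # [4. 5. 6.]
print(per_day)       # [5. 6.]
```

For a 2-D array, `axis=-1` refers to the same axis as `axis=1`.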


5.  Find the location which has the greatest windspeed on each day (an integer
    column number for each day).


In [29]:
print(wind.shape)
# Column index (0 to 11) of the windiest location on each day
max_loc = wind.argmax(axis=1)
max_loc


(6574, 12)


array([10, 10,  0, ..., 11, 11,  2], shape=(6574,))

6.  Find the year, month and day on which the greatest windspeed was recorded.


In [37]:
# Largest reading on each day, then the row index of the windiest day
max_row = wind.max(axis=1).argmax()
date[max_row]


array([66., 12.,  2.])
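
An alternative that recovers both the day and the location of the overall maximum in one step is to take `argmax` over the flattened array and convert it back to coordinates with `np.unravel_index`. The sketch below uses small made-up `date` and `wind` arrays (assumptions, not the real data) with the same row and column meaning as above.

```python
import numpy as np

# Hypothetical toy data: 3 days x 4 locations, with dates in a parallel array.
date = np.array([[61, 1, 1], [61, 1, 2], [61, 1, 3]])
wind = np.array([[10.0, 12.0,  9.0, 11.0],
                 [14.0, 30.5, 13.0, 12.0],
                 [ 8.0,  7.0, 15.0,  9.0]])

# argmax works on the flattened array; unravel_index maps it back to (row, column)
day_idx, loc_idx = np.unravel_index(wind.argmax(), wind.shape)

print(date[day_idx])   # [61  1  2]
print(loc_idx)         # 1
```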

7.  Find the average windspeed in January for each location.


In [46]:
# Boolean mask on the month column (column 1) selects every January row
jan_data = data[data[:, 1] == 1]

jan_data[:, 3:].mean(axis=0)


array([14.86955197, 12.92166667, 13.29962366,  7.19949821, 11.67571685,
        8.05483871, 11.81935484,  9.5094086 ,  9.54320789, 10.05356631,
       14.55051971, 18.02876344])

Bonus

1. Calculate the mean windspeed for each month in the dataset. Treat
   January 1961 and January 1962 as _different_ months. (hint: first find a
   way to create an identifier unique for each month. The second step might
   require a for loop.)
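
One possible approach, sketched on synthetic data because 'wind.data' is not reproduced here. The names `year`, `month`, `month_id` and the random windspeeds are assumptions for illustration; with the real file, `year` and `month` would correspond to `data[:, 0]` and `data[:, 1]`, and `wind` to `data[:, 3:]`. The sketch reads "mean windspeed for each month" as a single mean over all days and all locations in that month.

```python
import numpy as np

# Synthetic stand-in for the real table: two full years of daily rows.
rng = np.random.default_rng(0)
days = np.arange('1961-01-01', '1963-01-01', dtype='datetime64[D]')
year = days.astype('datetime64[Y]').astype(int) + 70        # two-digit year, e.g. 61
month = days.astype('datetime64[M]').astype(int) % 12 + 1   # 1..12
wind = rng.uniform(0, 30, size=(days.size, 12))             # made-up windspeeds

# Step 1: an identifier that is unique for each (year, month) pair.
month_id = year * 12 + (month - 1)

# Step 2: loop over the distinct months and average the matching rows.
ids = np.unique(month_id)
monthly_means = np.array([wind[month_id == m].mean() for m in ids])

print(monthly_means.shape)   # (24,)
```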



2. Calculate the min, max and mean windspeeds and standard deviations of the
   windspeeds across all locations for each week (assume that the first week
   starts on January 1 1961) for the first 52 weeks. This can be done without
   any for loop.
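
A loop-free sketch of this step, assuming the rows are consecutive days starting on January 1 1961. The `wind` array here is random stand-in data (an assumption); with the real file it would be `data[:, 3:]`. The idea is to keep the first 52 * 7 rows and reshape so that each row holds one week's 7 days x 12 locations.

```python
import numpy as np

rng = np.random.default_rng(0)
wind = rng.uniform(0, 30, size=(400, 12))   # hypothetical: 400 days x 12 locations

n_weeks = 52
# Fold each block of 7 consecutive days (7 x 12 readings) into a single row
weekly = wind[:n_weeks * 7].reshape(n_weeks, 7 * 12)

weekly_min = weekly.min(axis=1)
weekly_max = weekly.max(axis=1)
weekly_mean = weekly.mean(axis=1)
weekly_std = weekly.std(axis=1)

print(weekly.shape)        # (52, 84)
print(weekly_mean.shape)   # (52,)
```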


Bonus Bonus

Calculate the mean windspeed for each month without using a for loop.
(Hint: look at `searchsorted` and `add.reduceat`.)
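
A sketch of how the hint might be applied, again on synthetic data (the arrays `year`, `month` and `wind` are assumptions standing in for the columns of 'wind.data'). It relies on the rows being in chronological order, so that the month identifier is sorted: `searchsorted` then finds the first row of each month and `add.reduceat` sums the rows between consecutive starts.

```python
import numpy as np

# Synthetic stand-in for the real table: two full years of daily rows.
rng = np.random.default_rng(0)
days = np.arange('1961-01-01', '1963-01-01', dtype='datetime64[D]')
year = days.astype('datetime64[Y]').astype(int) + 70        # two-digit year
month = days.astype('datetime64[M]').astype(int) % 12 + 1   # 1..12
wind = rng.uniform(0, 30, size=(days.size, 12))             # made-up windspeeds

month_id = year * 12 + (month - 1)          # sorted, since rows are chronological
ids = np.unique(month_id)
starts = np.searchsorted(month_id, ids)     # index of the first day of each month

daily_sum = wind.sum(axis=1)                     # total over locations, per day
month_sum = np.add.reduceat(daily_sum, starts)   # total per month
days_per_month = np.diff(np.append(starts, month_id.size))

# Mean over all days and all locations in each month
monthly_means = month_sum / (days_per_month * wind.shape[1])

print(days_per_month[:3])   # [31 28 31]
```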
