In [1]:
import pandas as pd
import numpy as np

In [2]:
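# 72 hourly timestamps; lowercase 'h' is used because newer pandas versions deprecate the 'H' alias
rng = pd.date_range('1/1/2011', periods=72, freq='h')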
ts = pd.Series(np.random.randn(len(rng)), index=rng)

In [3]:
converted = ts.asfreq('45Min', method='pad')

In [5]:
# Does asfreq change the # of rows?
print(ts.shape)
print(converted.shape)

(72,)
(95,)


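_Reply:_ Changing the frequency does change the number of rows. Going from hourly to 45-minute sampling produces 95 rows instead of 72, a net gain of 23. The new index is a fresh 45-minute grid over the same time span, so only the original timestamps that fall on that grid (every third hour) are kept; the remaining rows are new timestamps whose values have to be filled in.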
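To see where the extra rows come from, one option is to compare the two indexes directly. The sketch below rebuilds the hourly series with placeholder values (`np.arange` instead of random numbers, an assumption made only so the result is reproducible) and counts the timestamps that are kept, dropped, and newly created.

```python
import numpy as np
import pandas as pd

# Same hourly index as above; the values are placeholders, not the notebook's random data.
rng = pd.date_range('1/1/2011', periods=72, freq='h')
ts = pd.Series(np.arange(72, dtype=float), index=rng)

converted = ts.asfreq('45min', method='pad')

kept = converted.index.intersection(ts.index)    # timestamps present in both indexes
dropped = ts.index.difference(converted.index)   # hourly stamps that are not on the 45-minute grid
created = converted.index.difference(ts.index)   # 45-minute stamps with no original observation

print(len(kept), len(dropped), len(created))     # 24 48 71
print(len(converted) - len(ts))                  # 23
```

So the net gain of 23 rows is really 71 new timestamps minus 48 hourly timestamps that no longer appear.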

### What do the different methods do?
method : {‘backfill’, ‘bfill’, ‘pad’, ‘ffill’, None}

_Reply:_ These methods fill in values for the new rows. The original data contains nothing for the timestamps we just added, and because it is point-in-time data (timestamps), there is nothing to guide what belongs there. The fill method tells pandas how to populate those rows: `pad`/`ffill` carries the last valid observation forward, `backfill`/`bfill` uses the next valid observation, and `None` leaves the new rows as `NaN`.
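A small side-by-side comparison may make the options easier to see. This is a minimal sketch on a three-point series with made-up values and a 30-minute target frequency, both chosen here only to keep the output short.

```python
import numpy as np
import pandas as pd

# Three hourly observations with easy-to-trace placeholder values.
rng = pd.date_range('1/1/2011', periods=3, freq='h')
ts = pd.Series([0.0, 1.0, 2.0], index=rng)

comparison = pd.DataFrame({
    'none': ts.asfreq('30min'),                    # new rows stay NaN
    'pad': ts.asfreq('30min', method='pad'),       # same as 'ffill': carry the last value forward
    'bfill': ts.asfreq('30min', method='bfill'),   # same as 'backfill': pull the next value back
})
print(comparison)
```

The `pad` column reads 0, 0, 1, 1, 2 and the `bfill` column reads 0, 1, 1, 2, 2, while `none` has `NaN` at the two half-hour rows.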

### Might any of these methods have pitfalls from a logical point of view?

_Reply:_ As discussed in the video for this assignment, back-filling and interpolation risk leaking future data into the present: the value at a new timestamp is built from an observation that had not happened yet, so we may end up using future data to forecast the future.
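The leak is easy to spot if each value records when it was observed. In this sketch the series values are simply the hour number (an assumption for illustration, not the notebook's random data), so a filled value tells us which hour it came from.

```python
import numpy as np
import pandas as pd

rng = pd.date_range('1/1/2011', periods=72, freq='h')
ts = pd.Series(np.arange(72, dtype=float), index=rng)   # value == hour at which it was observed

back = ts.asfreq('45min', method='bfill')
fwd = ts.asfreq('45min', method='pad')

stamp = pd.Timestamp('2011-01-01 00:45')
print(back.loc[stamp])   # 1.0: observed at 01:00, which is 15 minutes after the row's timestamp
print(fwd.loc[stamp])    # 0.0: observed at 00:00, already known at 00:45
```

Forward-filling only ever uses information that was available at the row's timestamp, which is why it is usually the safer choice for forecasting work.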

### What's the difference between going to a higher frequency and a lower frequency?

_Reply:_ A higher frequency adds rows and forces a decision about how to fill in the values for the new timestamps. A lower frequency is more likely to simply drop observations, which does more to protect the integrity of the data.

In [7]:
converted = ts.asfreq('90Min', method='bfill')
converted.shape

(48,)

### What's different logically about going to a higher frequency vs a lower frequency? 
What do you want to do when switching to a lower frequency that is not logical when switching to a higher frequency?

_Reply:_ Logically, the two are analogous to disaggregating and aggregating, respectively. Changing to a lower frequency essentially pools your existing data: each new timestamp stands for a longer interval that already contains several original observations. Because of that, you can interpolate between data points or backfill from later points, since that time is now part of the coarser sampling period. When going to a higher frequency, these same operations pushed future data into the past; when going to a lower frequency, they are appropriate.

In [10]:
ts.resample('D').sum()

2011-01-01   -2.364683
2011-01-02   -7.152354
2011-01-03    0.403373
Freq: D, dtype: float64

### What if you want to downsample and you don't want to ffill or bfill?

_Reply:_ Leave `method` as `None` (the default). Any timestamp in the new index that has no matching observation in the original data is then left as `NaN`.
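As a rough check of what that looks like at the 90-minute frequency used earlier, the sketch below counts the `NaN` rows. The placeholder values from `np.arange` are an assumption; only the index matters here.

```python
import numpy as np
import pandas as pd

rng = pd.date_range('1/1/2011', periods=72, freq='h')
ts = pd.Series(np.arange(72, dtype=float), index=rng)

sparse = ts.asfreq('90min')        # method=None is the default, so nothing is filled
print(sparse.shape)                # (48,)
print(sparse.isna().sum())         # 24: the stamps that fall on a half hour
print(sparse.dropna().shape)       # (24,): the stamps that match an original hourly observation
```

Half of the 90-minute timestamps land on a half hour, where the hourly series has no observation, so those rows stay empty unless a fill method is chosen.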

### What is the difference between .resample() and .asfreq()?

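_Reply:_ `.resample()` is the more systematic method: it groups the observations into time bins and expects you to say how each bin should be aggregated or filled. `.asfreq()` only picks out (or fills) the values at the new timestamps and is, as the video indicates, for 'fast and loose changes' in frequency.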
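The difference shows up clearly when both are asked for daily data. A minimal sketch, again with hour-number placeholder values assumed for traceability:

```python
import numpy as np
import pandas as pd

rng = pd.date_range('1/1/2011', periods=72, freq='h')
ts = pd.Series(np.arange(72, dtype=float), index=rng)

picked = ts.asfreq('D')             # keeps only the observation at each midnight
summed = ts.resample('D').sum()     # aggregates all 24 hourly observations of each day

print(picked.tolist())   # [0.0, 24.0, 48.0]
print(summed.tolist())   # [276.0, 852.0, 1428.0]
```

`asfreq` discards 23 of every 24 observations, while `resample` uses all of them to build each daily value.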

### What are some special things you can do with .resample() you can't do with .asfreq()?

_Reply:_ One of the things you can do is to apply the `fillna()` method (or `ffill()`/`bfill()` with a `limit`) to forward- or backfill data for a limited number of periods. You can also run different aggregation operations, such as finding the means and sums of observations over different timeframes.
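Both ideas can be sketched on a short series. The six placeholder values and the 15-minute and 3-hour frequencies are assumptions picked for readability. The limited fill uses the resampler's `ffill(limit=...)` rather than `fillna()`, which newer pandas versions may flag as deprecated.

```python
import numpy as np
import pandas as pd

rng = pd.date_range('1/1/2011', periods=6, freq='h')
ts = pd.Series(np.arange(6, dtype=float), index=rng)

# Upsample to 15 minutes, forward-filling at most one new row after each observation.
limited = ts.resample('15min').ffill(limit=1)
print(limited.head(5).tolist())    # [0.0, 0.0, nan, nan, 1.0]

# Downsample to 3-hour bins and compute several aggregates at once.
stats = ts.resample('3h').agg(['mean', 'sum'])
print(stats)
```

With `limit=1`, only the first 15-minute row after each hourly observation is filled; the rest stay `NaN`, which keeps stale values from being carried too far.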