# what will we learn when _we make calendars in pandas?_

below we are going to make calendars using pandas and explore the pandas api while we're at it. we'll explore different parts of the `pandas` api like:
* `pandas.date_range` to begin our work with the calendar
* `pandas.DataFrame.assign` to set new columns on our dataframe
* `df.apply(pandas.Series)` to widen dataframes on container elements
* `pandas.DataFrame.unstack` to change the shape of our dataframe by translating row indexes to columns.
* `pandas.DataFrame.style` to use a dataframe to provide cell level styling in our output.
* `pandas.DataFrame.groupby` in a for loop, as an iterator, which is quite a powerful technique for inspecting grouped operations
<!-- TEASER_END -->

In [1]:
    import pandas

to start with, we'll use `pandas` date/time tooling to construct the days of the year.

`pandas.date_range` is basically a smart `range` function for dates; we've defined the steps to be daily between the start and stop dates. it is really wise in `pandas` to work with your times and dates using their `pandas` types; there are better api affordances than remaining in an integer timestamp or time formatted string.

In [2]:
    start, stop = "2020-12-28", "2021-12-31"
    dates: pandas.Index =  pandas.date_range(start, stop, freq="D")
    F"our dates index has {len(dates)} days" 

'our dates index has 369 days'
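
as a small illustration of those affordances, here is a sketch on a made-up one-week range (the dates are chosen only for the example, they are not part of the calendar we build). it shows the datetime attributes that come with a `DatetimeIndex`, and that plain strings have to be parsed with `pandas.to_datetime` before they offer the same.

```python
import pandas

# a short, made-up range just for illustration
dates = pandas.date_range("2021-01-01", "2021-01-07", freq="D")

# datetime-aware attributes come for free on a DatetimeIndex
# dayofweek counts from monday=0 to sunday=6
print(list(dates.dayofweek))
print(dates.day_name()[0])

# plain strings need to be parsed before the same attributes are available
strings = ["2021-01-01", "2021-01-02"]
parsed = pandas.to_datetime(strings)
print(list(parsed.day))
```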

it would really help if our `calendar` demonstrated something practical, which is why we'll import the third party `holidays` library; the calendars we draw will highlight holidays.

    pip install holidays

In [3]:
    import holidays

In [4]:
    F"""there are {len([y for x, y in vars(holidays).items() if isinstance(y, type) and issubclass(y, holidays.HolidayBase)])} different calendars included in `holidays`"""

'there are 262 different calendars included in `holidays`'

for demonstration we'll restrict our holidays to just the `us` for now.

In [5]:
    us: holidays.HolidayBase = holidays.UnitedStates()

to start with we'll provide the form of our indexes and columns of our `DataFrame`

In [6]:
    df: pandas.DataFrame = pandas.DataFrame(None, dates.rename("date"), "year week month day dayofweek holiday".split())
    df.dropna()

| date | year | week | month | day | dayofweek | holiday |
|---|---|---|---|---|---|---|


we'll introduce the `pandas.DataFrame.assign` method that lets you assign new columns using keyword parameters. this is just another way of setting items using normal python syntax.

In [7]:
    df = df.assign(year=dates.year)
    df.head(2)

| date | year | week | month | day | dayofweek | holiday |
|---|---|---|---|---|---|---|
| 2020-12-28 | 2020 | | | | | |
| 2020-12-29 | 2020 | | | | | |
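
to make the comparison with item setting concrete, here is a minimal sketch on a toy dataframe (the `frame` and `double` names are made up for the example). it suggests that `assign` returns a new dataframe rather than changing the original, and that it also accepts callables.

```python
import pandas

frame = pandas.DataFrame({"day": [1, 2, 3]})

# assign returns a new dataframe and leaves the original untouched
assigned = frame.assign(double=frame["day"] * 2)

# the equivalent item-setting form changes the dataframe in place,
# so we work on a copy to keep `frame` as it was
copied = frame.copy()
copied["double"] = copied["day"] * 2

# callables are evaluated against the dataframe being assigned to,
# which is handy in method chains
chained = frame.assign(double=lambda d: d["day"] * 2)
```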


multiple columns can be assigned at once, and below we'll fill out the rest of our dataset.

In [8]:
    df = df.assign(
        month=dates.month, 
        dayofweek=dates.dayofweek, 
        day=dates.day, 
        holiday=df.index.map(us.__contains__)
    )
    df.head(2)

| date | year | week | month | day | dayofweek | holiday |
|---|---|---|---|---|---|---|
| 2020-12-28 | 2020 | | 12 | 28 | 0 | False |
| 2020-12-29 | 2020 | | 12 | 29 | 1 | False |


the `pandas` api has changed and we need to use the new `isocalendar` method (documented at [`pandas.Timestamp.isocalendar`](https://pandas.pydata.org/docs/reference/api/pandas.Timestamp.isocalendar.html)) to expand a datetime index into the ISO year, week number, and weekday. this is the new way to access the week of the year; direct access to `dates.weekofyear` is deprecated.

to get the week of the year we prepare the isocalendar and assign the week column.

In [9]:
    iso: pandas.DataFrame = dates.isocalendar()
    df = df.assign(
        week=iso.week
    )
    df.sample(2)

| date | year | week | month | day | dayofweek | holiday |
|---|---|---|---|---|---|---|
| 2021-02-07 | 2021 | 5 | 2 | 7 | 6 | False |
| 2021-06-22 | 2021 | 25 | 6 | 22 | 1 | False |
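
the iso week numbering has a quirk around new year that is worth seeing. the sketch below uses a made-up five-day range spanning the 2020/2021 boundary; it shows that `isocalendar` returns a dataframe with `year`, `week` and `day` columns, that the iso `day` counts from 1 where `dayofweek` counts from 0, and that the first days of january 2021 belong to iso week 53 of 2020. that is likely why those days sort into the last row of a month when rows are ordered by week.

```python
import pandas

# a made-up range that crosses the year boundary
dates = pandas.date_range("2020-12-31", "2021-01-04", freq="D")

# a dataframe indexed by the dates with year, week and day columns
iso = dates.isocalendar()

print(list(iso.columns))
print(list(iso["year"]))
print(list(iso["week"]))

# iso days run monday=1..sunday=7, dayofweek runs monday=0..sunday=6
print(list(iso["day"]), list(dates.dayofweek))
```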


for a `calendar` we'll want to group days into months, which we can achieve with a few steps in `pandas`.

it is really hard to avoid importing `numpy`, and we will use the standard library `calendar` module to zhoosh up our calendars.

In [10]:
    import numpy, calendar

`calendar` provides localized names for the days and months.

In [11]:
    month_names, day_names = list(calendar.month_name), list(calendar.day_name)
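
two details of these lists matter when indexing them with numbers from `pandas`, sketched below with only the standard library (the `monday` date is just an example). `calendar.month_name` is padded with an empty string so month numbers 1 to 12 can be used directly, and `calendar.day_name` starts on monday, which lines up with the monday=0 numbering of `dayofweek`.

```python
import calendar
import datetime

month_names = list(calendar.month_name)
day_names = list(calendar.day_name)

# month_name has 13 entries: index 0 is an empty string, january is index 1
print(len(month_names), repr(month_names[0]))

# day_name starts on monday, matching the weekday() numbering of datetime
monday = datetime.date(2021, 1, 4)
print(monday.weekday(), day_names[monday.weekday()])
```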

## display the calendars

we're near the place where we can present our calendars. all we need to do is iterate through the months

In [12]:
    df = df.set_index(["year", "month", "week", "dayofweek"])
    df.sample(2)

| year | month | week | dayofweek | day | holiday |
|---|---|---|---|---|---|
| 2021 | 10 | 39 | 4 | 1 | False |
| 2021 | 9 | 37 | 0 | 13 | False |


the `background` mapping holds the css used to style the days of a single month dataframe: holidays, ordinary days, and empty cells. we'll discuss the design choices following the code.

In [13]:
    background = {
        True: """background-color: yellow; color: black;""",
        False: """background-color: purple;""",
        numpy.nan: """font-size: 0px;"""
    }

for each day, we need to align it with a holiday. we're going to achieve this by unstacking the `dayofweek`, which moves that row index level to the columns and creates a multiindex on the columns. the indexes of `months` and `days` are the same, so access to their values is natural later on.

In [14]:
    months = df.unstack("dayofweek")
    days = months["day"].fillna(0).astype(int)
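
the reshaping is easier to see on a tiny frame. the sketch below uses three made-up days over two weeks (the values are invented for illustration); unstacking `dayofweek` turns each week into one row, and the gaps become missing values that `fillna(0)` turns into zeros, which is where the `0` cells in the calendars come from.

```python
import pandas

# three made-up days: a saturday and sunday in week 1, a monday in week 2
frame = pandas.DataFrame(
    {
        "week": [1, 1, 2],
        "dayofweek": [5, 6, 0],
        "day": [2, 3, 4],
    }
).set_index(["week", "dayofweek"])

# dayofweek moves from the row index to the columns,
# so the columns become a multiindex of (column name, dayofweek)
wide = frame.unstack("dayofweek")

# selecting "day" drops the outer column level; missing days become 0
days = wide["day"].fillna(0).astype(int)
print(days)
```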

what follows is kind of a holiday calendar using the pandas style attribute.

In [15]:
    def bg_color(x):
        # x is one row of `days`; x.name is its (year, month, week) index label
        return months["holiday"].loc[x.name].apply(background.get)

In [16]:
    days.sample(3).style.apply(bg_color, axis=1)

| year | month | week | dayofweek 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|---|---|
| 2021 | 12 | 49 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
| 2021 | 2 | 6 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
| 2021 | 5 | 18 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
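
the trick in `bg_color` is that a row-wise function receives each row as a series whose `name` is the row's index label, so the label can be used to look up the matching row in a second frame. here is a sketch of that lookup on toy data; it uses plain `DataFrame.apply` rather than the styler so it runs without rendering anything, and the `flags`, `css` and `row_css` names are made up for the example.

```python
import pandas

index = pandas.Index([1, 2], name="week")
days = pandas.DataFrame([[1, 2], [3, 4]], index=index, columns=[0, 1])

# a second frame with the same index and columns, holding a flag per cell
flags = pandas.DataFrame(
    [[False, True], [False, False]], index=index, columns=[0, 1]
)

css = {True: "background-color: yellow;", False: ""}

def row_css(row):
    # row.name is the index label of the current row, so it selects
    # the matching row of `flags`; each flag is then mapped to css
    return flags.loc[row.name].map(css.get)

# one css string per cell, in the same shape as `days`
styles = days.apply(row_css, axis=1)
print(styles)
```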


personally, i'd like a little more control over the composition of our calendars. we'll do this by mixing the `IPython` display objects and pandas.

style each month using the `pandas.DataFrame.style` attribute

In [17]:
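    # note: `Styler.hide_index` was deprecated in pandas 1.4;
    # newer releases spell it `Styler.hide(axis="index")`
    monthly = dict(
        (i, g.style.hide_index().apply(bg_color, axis=1))
        for i, g in days.groupby(["year", "month"])
    )
    monthly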

{(2020, 12): <pandas.io.formats.style.Styler at 0x7f0b3b46b1c0>,
 (2021, 1): <pandas.io.formats.style.Styler at 0x7f0b3b46b370>,
 (2021, 2): <pandas.io.formats.style.Styler at 0x7f0b3b46b4f0>,
 (2021, 3): <pandas.io.formats.style.Styler at 0x7f0b3b46b670>,
 (2021, 4): <pandas.io.formats.style.Styler at 0x7f0b3b46b9a0>,
 (2021, 5): <pandas.io.formats.style.Styler at 0x7f0b3b46bb80>,
 (2021, 6): <pandas.io.formats.style.Styler at 0x7f0b3b46bd60>,
 (2021, 7): <pandas.io.formats.style.Styler at 0x7f0b3b46bf40>,
 (2021, 8): <pandas.io.formats.style.Styler at 0x7f0b3b3b7160>,
 (2021, 9): <pandas.io.formats.style.Styler at 0x7f0b3b454f10>,
 (2021, 10): <pandas.io.formats.style.Styler at 0x7f0b3b454e50>,
 (2021, 11): <pandas.io.formats.style.Styler at 0x7f0b3b454970>,
 (2021, 12): <pandas.io.formats.style.Styler at 0x7f0b3b454d00>}
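
iterating over a `groupby` yields pairs of a key and the sub-dataframe for that key, which is what the dictionary above is built from. a minimal sketch on made-up data, grouping on two index levels so that the keys come out as `(year, month)` tuples:

```python
import pandas

# three made-up days indexed by year and month
frame = pandas.DataFrame(
    {"year": [2020, 2021, 2021], "month": [12, 1, 1], "day": [31, 1, 2]}
).set_index(["year", "month"])

# each iteration gives (key, group); grouping by two names gives tuple keys
groups = {key: group for key, group in frame.groupby(["year", "month"])}

for key, group in groups.items():
    print(key, list(group["day"]))
```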

construct raw html beginning with the `monthly` styled dataframes, and combine them into a parent container with the selector `div.calendars`. we put a heading for each month.

In [18]:
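    from IPython.display import HTML
    # note: `Styler.render` was deprecated in pandas 1.4 in favor of `Styler.to_html`
    HTML("""<div class="calendars">%s</div>""" % "\n".join([
        """<div class="month"><h2>%s %i</h2>%s</div>""" % (
            month_names[i[1]], i[0], monthly[i].render())
        for i in monthly
    ]))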

0,1,2,3,4,5,6
28,29,30,31,0,0,0

0,1,2,3,4,5,6
4,5,6,7,8,9,10
11,12,13,14,15,16,17
18,19,20,21,22,23,24
25,26,27,28,29,30,31
0,0,0,0,1,2,3

0,1,2,3,4,5,6
1,2,3,4,5,6,7
8,9,10,11,12,13,14
15,16,17,18,19,20,21
22,23,24,25,26,27,28

0,1,2,3,4,5,6
1,2,3,4,5,6,7
8,9,10,11,12,13,14
15,16,17,18,19,20,21
22,23,24,25,26,27,28
29,30,31,0,0,0,0

0,1,2,3,4,5,6
0,0,0,1,2,3,4
5,6,7,8,9,10,11
12,13,14,15,16,17,18
19,20,21,22,23,24,25
26,27,28,29,30,0,0

0,1,2,3,4,5,6
0,0,0,0,0,1,2
3,4,5,6,7,8,9
10,11,12,13,14,15,16
17,18,19,20,21,22,23
24,25,26,27,28,29,30
31,0,0,0,0,0,0

0,1,2,3,4,5,6
0,1,2,3,4,5,6
7,8,9,10,11,12,13
14,15,16,17,18,19,20
21,22,23,24,25,26,27
28,29,30,0,0,0,0

0,1,2,3,4,5,6
0,0,0,1,2,3,4
5,6,7,8,9,10,11
12,13,14,15,16,17,18
19,20,21,22,23,24,25
26,27,28,29,30,31,0

0,1,2,3,4,5,6
0,0,0,0,0,0,1
2,3,4,5,6,7,8
9,10,11,12,13,14,15
16,17,18,19,20,21,22
23,24,25,26,27,28,29
30,31,0,0,0,0,0

0,1,2,3,4,5,6
0,0,1,2,3,4,5
6,7,8,9,10,11,12
13,14,15,16,17,18,19
20,21,22,23,24,25,26
27,28,29,30,0,0,0

0,1,2,3,4,5,6
0,0,0,0,1,2,3
4,5,6,7,8,9,10
11,12,13,14,15,16,17
18,19,20,21,22,23,24
25,26,27,28,29,30,31

0,1,2,3,4,5,6
1,2,3,4,5,6,7
8,9,10,11,12,13,14
15,16,17,18,19,20,21
22,23,24,25,26,27,28
29,30,0,0,0,0,0

0,1,2,3,4,5,6
0,0,1,2,3,4,5
6,7,8,9,10,11,12
13,14,15,16,17,18,19
20,21,22,23,24,25,26
27,28,29,30,31,0,0


using the prior html output, and `div.calendars` as a reference we use the modern css [`grid` display layout](https://css-tricks.com/snippets/css/complete-guide-grid/). this results in the calendars being reorganized by the browser.

In [19]:
    HTML(
        """<style>
        .calendars {
            display: grid;
            grid-template-columns: 1fr 1fr 1fr;
        }
        </style>"""
    )
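
if you want to experiment with the number of columns, the style block can be generated from python. the sketch below is only an illustration: `grid_style` is a hypothetical helper, not something from `pandas` or `IPython`, and it builds the same css as above for any column count using plain string formatting.

```python
def grid_style(selector: str, columns: int) -> str:
    """Return a <style> element that lays out `selector` as an equal-width grid.

    This is a hypothetical helper written for this example.
    """
    # one "1fr" track per column, e.g. "1fr 1fr 1fr" for three columns
    template = " ".join(["1fr"] * columns)
    return (
        "<style>\n"
        f"{selector} {{\n"
        "    display: grid;\n"
        f"    grid-template-columns: {template};\n"
        "}\n"
        "</style>"
    )

style = grid_style(".calendars", 3)
print(style)
```

the resulting string could then be passed to `IPython.display.HTML` in the same way as the hand-written style above.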

## recap

we've managed most of our goal using `pandas` until we needed to customize the display; at that point we combined knowledge of __html__ and __css__ to gain extra control over our display. we used a few interesting `pandas.DataFrame` features like:

* `pandas.date_range` to begin our work with the calendar
* `pandas.DataFrame.assign` to set new columns on our dataframe
* `df.apply(pandas.Series)` to widen dataframes on container elements
* `pandas.DataFrame.unstack` to change the shape of our dataframe by translating row indexes to columns.
* `pandas.DataFrame.style` to use a dataframe to provide cell level styling in our output.
* we used `pandas.DataFrame.groupby` in a for loop, as an iterator, which is quite a powerful technique for inspecting grouped operations