# Anime News Network "Japanese Animation TV Ranking" scraping project

## Introduction

Every week, the Japanese company Video Research reports on ratings for the most-viewed TV programs of the week. These are broken down into several categories, one of which is animation. Since 2007, the website Anime News Network has translated the animation rankings into English and posted them for readers. The archived rankings are all still available on ANN. We will compile them and analyze them for any insights of interest.

Eventually, we will be downloading almost 20 years' worth of weekly rankings, adding up to around 900 web pages. We won't need to wait for one to finish loading before requesting another, so we'll want to use an asynchronous HTTP client like the one in `httpx`. ANN limits clients to 60 requests every 60 seconds, so we'll use the `aiolimiter` library to keep a lid on things, keeping every request inside an `async with limiter` block. To leave a bit of leeway, we'll set the time allotted per request so that only 59 requests go out each minute.

In [1]:
from httpx import AsyncClient
from aiolimiter import AsyncLimiter

limiter = AsyncLimiter(max_rate=1, time_period=60/59)

async def limit_get(client: AsyncClient, url: str):
    async with limiter:
        response = await client.get(url)
    response.raise_for_status()
    return response.text
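
As a quick sanity check on those limiter settings, the arithmetic can be sketched with plain Python and no network access. The page count of 900 is the rough figure quoted in the introduction, and the running time is only a lower bound, since it ignores how long the server takes to respond.

```python
# Back-of-the-envelope check of the limiter settings (no requests are sent).
SECONDS_PER_REQUEST = 60 / 59   # same value passed as time_period above
PAGES = 900                     # rough page count from the introduction

requests_per_minute = 60 / SECONDS_PER_REQUEST
minimum_minutes = PAGES * SECONDS_PER_REQUEST / 60

print(f"{requests_per_minute:.0f} requests per minute")
print(f"at least {minimum_minutes:.1f} minutes for {PAGES} pages")
```

So even with every request overlapping as much as the limiter allows, a full download should take a little over a quarter of an hour.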

With these preliminaries observed, we turn to the contents of the pages we'll be retrieving.

## Individual ranking pages

First, we must know how to extract the data from any given page. Here, we will start with the URL for an archive page, such as this:

https://www.animenewsnetwork.com/news/2024-12-21/japanese-animation-tv-ranking-december-9-15/.219307

And we will store the ranking table in a Pandas DataFrame.

Usually, this is straightforward, as there's only one `<table>` element on the page. [One post in 2015](https://www.animenewsnetwork.com/news/2015-09-17/japan-animation-tv-ranking-september-7-13/.93067) includes a summary of ratings for the recent live-action Death Note drama along with the normal table; the one we want is the second there, so we can avoid requiring special behavior by simply instructing our function to take the last table from each page.

We will, however, require special behavior in another context. In the first year ANN published this column, there were several cases where two, three, or even ten weeks' worth of ratings were included in one post. We'll want them all. This should be as simple as a call to `pd.concat()`. We could check for the specific pieces, but all are known to be in 2007 or 2008, so we can just check for those years -- the single-table pages will work fine, since `pd.concat()` works with a single DataFrame as input.

In [2]:
import io
import pandas as pd

# For future-proofing with Pandas 3.0.
pd.options.mode.copy_on_write = True

async def get_tables_from_url(client: AsyncClient, url: str):
    page = await limit_get(client, url)
    return pd.read_html(io.StringIO(page))

async def get_df_from_url(client: AsyncClient, url: str):
    dfs = await get_tables_from_url(client, url)
    if ("news/2007" in url) or ("news/2008" in url):
        return pd.concat(dfs, ignore_index=True)
    else:
        return dfs[-1]

Combining DataFrames from separate pages poses a problem, though. The broadcast dates are listed with the month and day, but not the year. The year is clear in the original context, but we will be combining rankings from over a decade, so "May 5" won't be very useful.

A number of solutions are possible, but I chose to take the date ANN published each ranking from its URL, then use the `Series.replace()` function in Pandas to swap each date string for a full date. Under the assumption that they wouldn't have published rankings for broadcasts more than three months prior, we will make a dictionary mapping possible date strings in the relevant formats to the appropriate date objects.

Normally, the broadcast dates follow a format such as "December 8 (Sun)". The day of week is of limited use to us, and does not follow a standard format; we will separate it into a new column for future reference.

In [3]:
def separate_date_of_week(df: pd.DataFrame):
    # A row in df['Date'] might start as "December 8 (Sun)".
    date_col_split = df['Date'].str.extract(r"(\w+ ?\d+) ?\((\w+)\)")
    new_df = df.copy()

    # "December 8"
    new_df['Date'] = date_col_split[0]
    # "Sun"
    new_df['Listed day of week'] = date_col_split[1]
    
    return new_df
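
To see what that pattern captures, here is a small sketch that applies the same regular expression with the standard library's `re` module instead of Pandas. The sample strings are illustrative, modeled on the formats described in this section rather than copied from particular ANN pages.

```python
import re

# Same pattern as in separate_date_of_week, used outside of Pandas.
DATE_PATTERN = re.compile(r"(\w+ ?\d+) ?\((\w+)\)")

# Illustrative inputs, not quotes from specific pages.
samples = ["December 8 (Sun)", "December 08 (Sun)", "August7 (Sun)", "May 5(Thurs)"]

results = {}
for text in samples:
    date_part, day_part = DATE_PATTERN.search(text).groups()
    results[text] = (date_part, day_part)
    print(f"{text!r} -> date={date_part!r}, day={day_part!r}")
```

The optional spaces in the pattern are what let it cope with a missing space before the day number or before the parenthesis.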

We'll need a function to get the date from the URL and turn it into a Python `date` object.

In [4]:
from datetime import date
import re

def get_pub_date(url: str):
    date_string = re.search(r"/news/([-\d]+)/", url)[1]
    return date.fromisoformat(date_string)
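
As a minimal check, the same two steps can be run on the example URL from the start of this section:

```python
import re
from datetime import date

# The example archive URL quoted earlier in this section.
url = (
    "https://www.animenewsnetwork.com/news/2024-12-21/"
    "japanese-animation-tv-ranking-december-9-15/.219307"
)

# The ISO-format date sits between "/news/" and the next slash.
date_string = re.search(r"/news/([-\d]+)/", url)[1]
pub_date = date.fromisoformat(date_string)

print(pub_date)  # 2024-12-21
```

Note that if a URL lacked the `/news/<date>/` segment, `re.search()` would return `None` and the subscript would raise a `TypeError`, so this approach assumes every URL follows the pattern.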

Now to produce the dict from this date. We'll construct it from a sequence of key-value pairs, containing a string and the date that it means. These pairs can be produced with a generator.

Some data quality issues come into play here. During testing, at least one ranking was found to add a leading zero to single-digit days of the month, though most go without. For single-digit days, we'll have to generate strings for both options. In hopes of keeping the code readable, this will be split into a separate generator.

Other cases were less easy to deal with: typos in the month or day that put the date outside the three-month range, and misspelled month names. For these cases, the best solution I found was to identify each page with a typo and include a special case for catching it. Potentially, this could also be used to catch dates with a leading zero, but at present it doesn't seem worthwhile.

In [5]:
from datetime import timedelta

def kvp_for_date(d: date):
    yield (f"{d:%B %d}", d)
    if d.day < 10:
        yield (f"{d:%B} {d.day}", d)

def typo_kvp(pub_date: date):
    match f"{pub_date}":
        case '2011-08-15':
            yield ('August7', date(2011, 8, 7))
        case '2015-03-30':
            yield ('December 21', date(2015, 3, 21))
        case '2015-10-08':
            yield ('December 19', date(2015, 9, 19))
        case '2016-09-24':
            yield ('Spetember 17', date(2016, 9, 17))
        case '2016-09-29':
            yield ('Spetember 24', date(2016, 9, 24))
        case '2021-05-15':
            yield ('May 25', date(2021, 5, 2))
        case '2022-01-29':
            yield ('July 23', date(2022, 1, 23))
        case '2022-02-05':
            yield ('July 30', date(2022, 1, 30))

def last_90_days_kvp(d: date):
    for i in range(90):
        yield from kvp_for_date(d - timedelta(days=i))
    yield from typo_kvp(d)

def last_90_days_dict(d: date):
    return dict(last_90_days_kvp(d))
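
The main payoff of this lookup is that it resolves the year correctly around New Year. The sketch below rebuilds a condensed version of the dictionary (without the typo handling) to show that; `date_strings` and `lookup_for` are hypothetical names used only here, and the publish date is made up for the demonstration. It also assumes an English locale, since `%B` produces the month name in the current locale.

```python
from datetime import date, timedelta

def date_strings(d: date):
    # "January 02" style, plus "January 2" style for single-digit days.
    yield f"{d:%B %d}", d
    if d.day < 10:
        yield f"{d:%B} {d.day}", d

def lookup_for(pub_date: date) -> dict[str, date]:
    # The publish date and the 89 days before it, as in last_90_days_kvp.
    return dict(
        pair
        for i in range(90)
        for pair in date_strings(pub_date - timedelta(days=i))
    )

# Hypothetical publish date in early January, chosen to show the rollover.
lookup = lookup_for(date(2025, 1, 4))

print(lookup["December 29"])     # 2024-12-29
print(lookup["January 2"])       # 2025-01-02
print("September 30" in lookup)  # False: outside the 90-day window
```

Because no month-and-day string can occur twice within 90 days, each key maps to exactly one date.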

We're ready to put it all together. We'll split the day of week into its own column, fix any dates that look wrong, and turn the Date column into a datetime format. While we're at it, we'll also include the first day of the week tracked (they go from Monday to Sunday) and the date Anime News Network published the ranking.

One more thing: Until the 2020s, ANN's tables included an image for the anime being broadcast; these will come out as NaN in our DataFrame. Sometimes there'll also be tables with an overlooked blank row. We'll want to remove both of these with `DataFrame.dropna()`.

In [6]:
def first_day_of_week(s: pd.Series):
    return s.dt.to_period('W').dt.start_time

def prepare_ranking_table(df: pd.DataFrame, url: str):
    df = separate_date_of_week(df)

    pub_date = get_pub_date(url)
    last_90_days = last_90_days_dict(pub_date)

    df['Date'] = df['Date'].replace(last_90_days).astype("datetime64[ns]")
    df['Publish date'] = pd.to_datetime(pub_date)
    df['First day of week'] = first_day_of_week(df['Date'])
    
    return df

async def get_prepared_df_from_url(client: AsyncClient, url: str):
    try:
        raw_df = await get_df_from_url(client, url)
        raw_df = raw_df.dropna(axis='index', how='all')
        raw_df = raw_df.dropna(axis='columns', how='all')
        return prepare_ranking_table(raw_df, url)
    except Exception:
        print(f"Error getting DataFrame from {url}")
        raise
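
The `first_day_of_week` helper relies on Pandas weekly periods (`'W'`) ending on Sunday by default, so each period starts on a Monday. As a cross-check, the same Monday can be computed with the standard library. `monday_of` is a hypothetical helper that is not used elsewhere in this notebook.

```python
from datetime import date, timedelta

def monday_of(d: date) -> date:
    # date.weekday() is 0 for Monday and 6 for Sunday.
    return d - timedelta(days=d.weekday())

print(monday_of(date(2025, 2, 2)))   # a Sunday -> 2025-01-27
print(monday_of(date(2025, 1, 31)))  # a Friday -> 2025-01-27
print(monday_of(date(2025, 1, 27)))  # a Monday -> 2025-01-27
```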

## Multiple ranking pages

Now, at last, we're ready to concatenate the ranking tables from separate weeks.

In [7]:
import asyncio

async def get_df_from_urls(client: AsyncClient, urls: list[str]):
    coroutines = [get_prepared_df_from_url(client, url) for url in urls]
    return pd.concat(await asyncio.gather(*coroutines), ignore_index=True)

Of course, to make use of this, we'll need the URLs for all the rankings.

ANN has archive pages where one can view links to all news pieces in a given month or year, each with the headline and the date and time published. We'll need every monthly page back to 2007, and we'll use `BeautifulSoup` to extract the links to the TV rankings. We'll start with the function to get relevant links from a single month's archive page, then one to combine those for a year's worth.

Each ANN page includes over 80 KB of overhead unrelated to the main content, so ideally, we'd want to obtain these links from the annual archive pages, to save bandwidth for both them and us. However, the ANN servers seem to have more trouble delivering 18 annual pages than 216 monthly pages in chunks of 12. This means downloading something on the order of twice as much data overall, but we can sacrifice that for the required robustness.

Over the years, the format ANN's used for the titles of these articles has varied slightly, between the likes of "Japanese Anime TV Ranking", "Japanese Animation TV Ranking", and "Japan's Animation TV Ranking". We'll use a regular expression that should capture them all.

In [8]:
from urllib.parse import urljoin

from bs4 import BeautifulSoup

async def tv_rankings_for_month(client: AsyncClient, year: int, month: int) -> list[str]:
    ann_base = "https://www.animenewsnetwork.com"
    archive_page = await limit_get(client, f"{ann_base}/news/{year}/{month:02}")
    # Name the parser explicitly so BeautifulSoup doesn't have to guess one.
    soup = BeautifulSoup(archive_page, "html.parser")
    return [
        urljoin(ann_base, a['href'])
        for a in soup.css.select(".article-list li a")
        if re.search(r"Anim[a-z]+ TV Ranking", a.text)
    ]

async def tv_rankings_for_year(client: AsyncClient, year: int) -> list[str]:
    today = date.today()
    # Named last_month to avoid shadowing the built-in max().
    last_month = 12 if year != today.year else today.month
    coros = [tv_rankings_for_month(client, year, m) for m in range(last_month, 0, -1)]
    return [url for l in await asyncio.gather(*coros) for url in l]
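
To check that the title pattern covers the variants mentioned above, here is a short sketch run against those three titles. The fourth headline is an invented one that should not match, added for contrast.

```python
import re

TITLE_PATTERN = re.compile(r"Anim[a-z]+ TV Ranking")

headlines = [
    "Japanese Anime TV Ranking",
    "Japanese Animation TV Ranking",
    "Japan's Animation TV Ranking",
    "Japan's Animation DVD Ranking",  # invented non-match for contrast
]

# True where the pattern is found anywhere in the headline.
matches = {h: bool(TITLE_PATTERN.search(h)) for h in headlines}
for headline, matched in matches.items():
    print(f"{matched!s:5} {headline}")
```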

And now a function to do this for every year, and combine the URLs we need into one listing.

In [9]:
async def all_ranking_urls(client: AsyncClient):
    current_year = date.today().year
    archive_coros = [
        tv_rankings_for_year(client, y)
        for y in range(current_year, 2006, -1)
    ]
    return [url for l in await asyncio.gather(*archive_coros) for url in l]

One case was found where errors were introduced due to the same article being accidentally posted twice on ANN. To adjust for that, we'll de-duplicate the URLs based on the parts before the unique article ID at the end.

In [10]:
def dedup_urls(urls: list[str]) -> list[str]:
    return list({u.split('/.')[0]:u for u in urls}.values())
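
Here is how that de-duplication behaves on a small list. The URLs are hypothetical placeholders on `example.com`; the first two share a path and differ only in the article ID.

```python
def dedup_urls(urls: list[str]) -> list[str]:
    # Copied from the cell above so this sketch is self-contained.
    return list({u.split('/.')[0]: u for u in urls}.values())

# Hypothetical URLs for illustration only.
urls = [
    "https://example.com/news/2020-01-17/ranking-january-6-12/.100001",
    "https://example.com/news/2020-01-17/ranking-january-6-12/.100002",
    "https://example.com/news/2020-01-24/ranking-january-13-19/.100003",
]

deduped = dedup_urls(urls)
for u in deduped:
    print(u)
```

When two URLs collide, the dictionary keeps the one that appears later in the list, so here the `.100002` copy survives.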

All we need to get started now is the client.

In [11]:
async def get_full_df_from_ann():
    async with AsyncClient(timeout=None) as client:
        urls = dedup_urls(await all_ranking_urls(client))
        return await get_df_from_urls(client, urls)

Unfortunately, testing shows this doesn't quite work as intended -- the server starts acting up again, for reasons I confess I'm not quite clear on. So instead of combining them into one function call that uses the same client for both, we'll use a separate cell for each, creating and closing a new client both times.

In [12]:
async with AsyncClient(timeout=None) as client:
    urls = dedup_urls(await all_ranking_urls(client))

print(f"{len(urls)} pages with TV rankings found.")

901 pages with TV rankings found.


In [13]:
async with AsyncClient(timeout=None) as client:
    raw_df = await get_df_from_urls(client, urls)

raw_df

Unnamed: 0,Title,Station,Date,Time,Length,Average Household Rating,Listed day of week,Publish date,First day of week
0,Sazae-san,Fuji TV,2025-02-02,18:30,30 min.,8.6,Sun,2025-02-08,2025-01-27
1,Detective Conan,NTV,2025-02-01,18:00,30 min.,6.1,Sat,2025-02-08,2025-01-27
2,Chibi Maruko-chan,Fuji TV,2025-02-02,18:00,30 min.,5.9,Sun,2025-02-08,2025-01-27
3,The Apothecary Diaries,NTV,2025-01-31,23:30,30 min.,4.2,Fri,2025-02-08,2025-01-27
4,Doraemon,TV Asahi,2025-02-01,17:00,30 min.,3.7,Sat,2025-02-08,2025-01-27
...,...,...,...,...,...,...,...,...,...
9365,Kakashi's Team Advances! Naruto: Shippuuden Sp...,TV Tokyo,2007-04-12,19:59,55 min.,5.3,Thurs,2007-04-19,2007-04-09
9366,Pururun! Shizuku-chan,TV Tokyo,2007-04-14,09:30,30 min.,5.3,Sat,2007-04-19,2007-04-09
9367,Bleach,TV Tokyo,2007-04-11,19:26,29 min.,4.9,Wed,2007-04-19,2007-04-09
9368,Oha Coliseum,TV Tokyo,2007-04-14,08:30,30 min.,4.9,Sat,2007-04-19,2007-04-09


In [14]:
raw_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 9370 entries, 0 to 9369
Data columns (total 9 columns):
 #   Column                    Non-Null Count  Dtype         
---  ------                    --------------  -----         
 0   Title                     9370 non-null   object        
 1   Station                   9370 non-null   object        
 2   Date                      9370 non-null   datetime64[ns]
 3   Time                      9370 non-null   object        
 4   Length                    9370 non-null   object        
 5   Average Household Rating  9370 non-null   float64       
 6   Listed day of week        9370 non-null   object        
 7   Publish date              9370 non-null   datetime64[ns]
 8   First day of week         9370 non-null   datetime64[ns]
dtypes: datetime64[ns](3), float64(1), object(5)
memory usage: 659.0+ KB


And now to save our DataFrame as a CSV.

In [15]:
from datetime import datetime

def timestamped_csv(title: str, df: pd.DataFrame):
    df.to_csv(f"{title}_{datetime.now():%Y%m%d%H%M%S}.csv", index=False)

timestamped_csv("ann_tv_rankings_raw", raw_df)

# Post-download processing

We have more data quality issues to address before we can really analyze the rankings. It's good data analysis practice to make a copy of your raw data before cleanup.

In [16]:
cleaned_df = raw_df.copy()

Let's start with a look at our `Listed day of week` column.

In [17]:
cleaned_df['Listed day of week'].value_counts()

Listed day of week
Sun       4208
Sat       2873
Fri       1281
Thurs      587
Mon        167
Wed        156
Tues        64
Thu         23
Tue          5
Sunday       4
Monday       1
Friday       1
Name: count, dtype: int64

Clearly, these are not in any consistent format. We'll want to unify them as three-letter abbreviations (Sun, Mon, and so on). This is also a good case for use of a categorical data type, since there will only be seven possible string values when we're done.

In [18]:
cleaned_df['Listed day of week'] = cleaned_df['Listed day of week'].replace({
    'Thurs':  'Thu',
    'Tues':   'Tue',
    'Sunday': 'Sun',
    'Monday': 'Mon',
    'Friday': 'Fri',
}).astype('category')

We'll want to compare for accuracy, of course, by checking them against the date given.

In [19]:
day_name = cleaned_df['Date'].dt.day_name()
cleaned_df['Computed day of week'] = day_name.str[:3].astype('category')

mismatched_dow = cleaned_df['Computed day of week'] != \
    cleaned_df['Listed day of week']
n, d = mismatched_dow.sum(), len(cleaned_df)
print(f"{n} of {d} rows, or {n/d:.2%}, have mismatched days of the week.")

if n > 0:
    print("\nTop examples:")
    print(cleaned_df.loc[mismatched_dow, 'Title'].value_counts().head())

75 of 9370 rows, or 0.80%, have mismatched days of the week.

Top examples:
Title
Shin Sanjūshi (The New Three Musketeers)    9
Detective Conan                             8
Pokémon Sun & Moon                          6
Oshiri Tantei                               5
One Piece                                   3
Name: count, dtype: int64


Unfortunately, we have some errors -- and it's hard to be sure whether the date or day of week listed is correct. Consider the case of the popular anime Detective Conan:

In [20]:
cleaned_df.loc[
    (cleaned_df['Title'] == "Detective Conan"),
    ['Listed day of week', 'Computed day of week']
].value_counts()

Listed day of week  Computed day of week
Sat                 Sat                     767
Mon                 Mon                      30
                    Sat                       8
Name: count, dtype: int64

Both columns agree that the series is usually broadcast on Saturday, but at times it's been broadcast on Monday. Without digging deeper, it's hard to be sure which is correct for the eight discrepancies, where a date that falls on a Saturday was listed as a Monday. We might compare an additional source with broadcast dates for error correction, or investigate each series one by one -- perhaps the regular broadcast day was changed at some point, and that knowledge can be used to pick the correct day based on whether the episode was broadcast before then or after. For now, we will leave the data as is, with a caution that the day-and-date broadcast info cannot be relied on for almost 1% of all rows.

Another sure sign of errors can be found by searching for cases of multiple programs apparently being broadcast on the same channel at the exact same time:

In [21]:
dup = cleaned_df.duplicated(subset=['Station', 'Date', 'Time'], keep=False)
cleaned_df.loc[dup]

Unnamed: 0,Title,Station,Date,Time,Length,Average Household Rating,Listed day of week,Publish date,First day of week,Computed day of week
1932,Detective Conan,NTV,2021-07-31,17:30,30 min.,4.0,Sat,2021-08-07,2021-07-26,Sat
1936,My Hero Academia,NTV,2021-07-31,17:30,30 min.,3.2,Sat,2021-08-07,2021-07-26,Sat
3380,One Piece,Fuji TV,2018-11-11,9:30,30 min.,5.6,Sun,2018-11-22,2018-11-05,Sun
3390,One Piece,Fuji TV,2018-11-11,9:30,30 min.,5.0,Sun,2018-11-15,2018-11-05,Sun
3799,Pokémon Sun & Moon,TV Tokyo,2018-01-25,18:55,30 min.,3.5,Thu,2018-02-08,2018-01-22,Thu
3808,Pokémon Sun & Moon,TV Tokyo,2018-01-25,18:55,30 min.,3.8,Thu,2018-02-01,2018-01-22,Thu
3858,Animated O-saru no George (Curious George),NHK-E,2017-12-16,08:35,25 min.,3.5,Sat,2017-12-28,2017-12-11,Sat
3870,Animated O-saru no George (Curious George),NHK-E,2017-12-16,08:35,25 min.,2.8,Sat,2017-12-21,2017-12-11,Sat
3903,Crayon Shin-chan,TV Asahi,2017-11-10,19:30,24 min.,8.3,Fri,2017-11-23,2017-11-06,Fri
3904,Detective Conan,NTV,2017-11-11,18:00,30 min.,7.8,Sat,2017-11-23,2017-11-06,Sat


In [22]:
print(f"{dup.sum() // 2} errors detected through duplicates.")

17 errors detected through duplicates.


These, again, will need to be inspected on a case-by-case basis, to determine which fields in which rows are in error.

Setting that aside for now, we'll shift our focus to the `Station` column.

In [23]:
cleaned_df['Station'].value_counts()

Station
Fuji TV        3369
TV Asahi       2229
NTV            1650
TV Tokyo       1059
NHK-E           887
TBS             104
NHK              41
NHK-G            23
NHK General       6
TV-TOKYO          1
Nippon TV         1
Name: count, dtype: int64

As with `Listed day of week`, there are a few inconsistencies in the names used for stations, and so we'll normalize them the same way.

In [24]:
cleaned_df['Station'] = cleaned_df['Station'].replace({
    'NHK-G':       'NHK',
    'NHK General': 'NHK',
    'TV-TOKYO':    'TV Tokyo',
    'Nippon TV':   'NTV',
}).astype('category')

cleaned_df['Station'].value_counts()

Station
Fuji TV     3369
TV Asahi    2229
NTV         1651
TV Tokyo    1060
NHK-E        887
TBS          104
NHK           70
Name: count, dtype: int64

All seven national terrestrial networks, two public and five commercial, are represented. TBS has the fewest appearances of the commercial networks, though it does have some standout franchises like Gundam and Jujutsu Kaisen. Of note, independent UHF stations like [Tokyo MX](https://en.wikipedia.org/wiki/Tokyo_MX) are completely absent. This could be because none of their anime have ever had enough live viewers, though it may also be that Video Research undercounts viewership rates for these stations because they have less signal power, and therefore reach fewer viewers.

For example, the anime Oshi no Ko and Mobile Suit Gundam: The Witch from Mercury were both popular in the spring of 2023. But everyone in Kanto who watched The Witch from Mercury with a TV antenna did so on TBS, while Oshi no Ko's Kanto viewers would've watched it on Tokyo MX in Tokyo, [TV Kanagawa](https://en.wikipedia.org/wiki/Television_Kanagawa) in Yokohama, [Gunma TV](https://en.wikipedia.org/wiki/Gunma_Television) in Maebashi, and so on. If Video Research counts viewers of those broadcasts separately, the viewership numbers for Oshi no Ko would be fragmented. This could explain why Oshi no Ko never charted in this time and The Witch from Mercury did, even though Google Trends [suggests](https://trends.google.com/trends/explore?date=2023-01-01%202023-12-31&geo=JP&q=%E6%8E%A8%E3%81%97%E3%81%AE%E5%AD%90,%E6%B0%B4%E6%98%9F%E3%81%AE%E9%AD%94%E5%A5%B3,%E9%AC%BC%E6%BB%85%E3%81%AE%E5%88%83,%E3%82%AC%E3%83%B3%E3%83%80%E3%83%A0&hl=en-US) there was more interest in Oshi no Ko than The Witch from Mercury at the time, or even Demon Slayer.

Next, we'll check to see that the `Average Household Rating` values are what we'd expect.

In [25]:
cleaned_df['Average Household Rating'].describe()

count    9370.000000
mean        6.691686
std         3.880626
min         0.600000
25%         3.700000
50%         5.800000
75%         8.800000
max        85.000000
Name: Average Household Rating, dtype: float64

We have a clear outlier; it's very unlikely an anime broadcast gained an 85% viewership rating in the 21st century. Let's look closer.

In [26]:
cleaned_df.sort_values(by='Average Household Rating').tail()

Unnamed: 0,Title,Station,Date,Time,Length,Average Household Rating,Listed day of week,Publish date,First day of week,Computed day of week
7598,Sazae-san,Fuji TV,2010-11-21,18:30,30 min.,24.0,Sun,2010-11-27,2010-11-15,Sun
7497,Sazae-san,Fuji TV,2011-01-30,18:30,30 min.,24.2,Sun,2011-02-06,2011-01-24,Sun
7518,Sazae-san,Fuji TV,2011-01-16,18:30,30 min.,24.3,Sun,2011-01-23,2011-01-10,Sun
7639,Sazae-san,Fuji TV,2010-10-24,18:30,30 min.,24.7,Sun,2010-11-08,2010-10-18,Sun
2767,Chibi Maruko-chan,Fuji TV,2020-01-05,18:00,30 min.,85.0,Sun,2020-01-17,2019-12-30,Sun


Anime News Network's archives can also be accessed by the day. [Pasting in the relevant date](https://animenewsnetwork.com/news/2020-01-17), we can quickly find our ranking, where we see the problem: Chibi Maruko-chan's 8.5 rating was mistyped with a comma in place of the decimal point, and `pd.read_html()` treats commas as thousands separators by default, so it read the value as 85. Let's correct that now.

In [27]:
marukochan85 = cleaned_df['Average Household Rating'] == 85

cleaned_df.loc[marukochan85, 'Average Household Rating'] = 8.5
cleaned_df.loc[marukochan85]

Unnamed: 0,Title,Station,Date,Time,Length,Average Household Rating,Listed day of week,Publish date,First day of week,Computed day of week
2767,Chibi Maruko-chan,Fuji TV,2020-01-05,18:00,30 min.,8.5,Sun,2020-01-17,2019-12-30,Sun


That's our only real outlier -- the Sazae-san ratings are quite plausible for that period. Next come the `Length` and `Time` columns, each of which you'll have noticed follows a typical format. Let's see if any rows don't fit those formats.

In [28]:
cleaned_df.loc[~cleaned_df['Length'].str.fullmatch(r"\d+ min\.")]

Unnamed: 0,Title,Station,Date,Time,Length,Average Household Rating,Listed day of week,Publish date,First day of week,Computed day of week
7545,Anime Major 4,NHK-E,2011-01-01,9:14 11:16,24 min. 25 min.,2.0,Sat,2011-01-16,2010-12-27,Sat
7546,Anime Major 4,NHK-E,2011-01-01,8:00 8:49 9:38 10:03 10:52,24-25 min.,1.9,Sat,2011-01-16,2010-12-27,Sat


In [29]:
cleaned_df.loc[~cleaned_df['Time'].str.fullmatch(r"\d+:\d+")]

Unnamed: 0,Title,Station,Date,Time,Length,Average Household Rating,Listed day of week,Publish date,First day of week,Computed day of week
7545,Anime Major 4,NHK-E,2011-01-01,9:14 11:16,24 min. 25 min.,2.0,Sat,2011-01-16,2010-12-27,Sat
7546,Anime Major 4,NHK-E,2011-01-01,8:00 8:49 9:38 10:03 10:52,24-25 min.,1.9,Sat,2011-01-16,2010-12-27,Sat


In only one case do the columns diverge from the norm: On the morning of New Year's Day 2011, NHK Educational TV seems to have broadcast several episodes of the baseball anime Major. If we want to convert these columns to numeric formats, we'll have to decide what to do about these two rows. We could just remove them altogether, or if we can find a viable source for which broadcasts were which lengths, we could split the two rows into seven, one per listed airtime.

There is one other obstacle to making use of the time column: the 30-hour system used by Japanese TV schedules. Typically, late-night broadcast times from midnight to 6 AM are listed as though they were an extension of the previous day, under the logic that viewers staying up late to watch experience them that way. Instead of 02:00 Friday, for example, a time might be written as 26:00 Thursday. Anime News Network usually corrects these to standard time, but not always.

In [30]:
hour = cleaned_df['Time'].str.split(':').str[0].astype(int)
hour.value_counts().sort_index()

Time
0       55
1       16
2        2
4        1
5        2
6       34
7       47
8     1177
9     1935
10     155
11      36
12       4
13       9
14       1
15       6
16     280
17     915
18    2856
19    1637
21      41
22       3
23     157
24       1
Name: count, dtype: int64

In [31]:
cleaned_df[hour >= 24]

Unnamed: 0,Title,Station,Date,Time,Length,Average Household Rating,Listed day of week,Publish date,First day of week,Computed day of week
2182,Attack on Titan The Final Season,NHK,2021-02-14,24:10,25 min.,2.6,Sun,2021-02-19,2021-02-08,Sun


If each row only has a single time (which will be the case if we adjust the Major marathon above), the `Date` and `Time` columns can be merged into a single column of dtype `datetime64[ns]` by adding the latter's hours and minutes to the former, as though it represented a timedelta. This will normalize any listed times of 24:00 or later.

In [32]:
# d has dtype datetime64[ns], containing the start of each date.
# t has dtype object, containing strings representing the broadcast time.
def pandas_process_30hour(d: pd.Series, t: pd.Series):
    assert t.str.fullmatch(r"\d+:\d+").all()
    return d + pd.to_timedelta(t+':00')

# If we allow multiple times in one row, using this instead will create a
# datetime column based on the first time listed.
def pandas_process_30hour_multitime(d: pd.Series, t: pd.Series):
    times = t.str.split().str.len()
    first_time = t.str.split(expand=True)[0]
    return d + pd.to_timedelta(first_time+':00'), times

def add_datetime_col(df: pd.DataFrame):
    dt = pandas_process_30hour(df['Date'], df['Time'])
    df['Datetime'] = dt
    return df

def add_datetime_col_multitime(df: pd.DataFrame):
    dt, n = pandas_process_30hour_multitime(df['Date'], df['Time'])
    df['Datetime'] = dt
    df['Times listed'] = n
    return df
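
The same roll-over arithmetic can be sketched for a single value with the standard library, which may make it easier to see what the Pandas version does to each row. `resolve_30hour` is a hypothetical helper, and the first example uses the Attack on Titan row found above.

```python
from datetime import datetime, timedelta

def resolve_30hour(day: datetime, clock: str) -> datetime:
    # Treat "HH:MM" as an offset from midnight, so hours past 23 roll over
    # into the following calendar day.
    hours, minutes = (int(part) for part in clock.split(":"))
    return day + timedelta(hours=hours, minutes=minutes)

print(resolve_30hour(datetime(2021, 2, 14), "24:10"))  # 2021-02-15 00:10:00
print(resolve_30hour(datetime(2021, 2, 14), "18:30"))  # 2021-02-14 18:30:00
```

One consequence to keep in mind: after this conversion, the calendar date of a late-night broadcast no longer matches the listed day of week, which follows the 30-hour convention.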

If we're not converting the times from strings, we may still want to standardize the format for times before 10:00 -- some such times in the data have leading zeroes, and some don't.

In [33]:
leading_zero = cleaned_df['Time'].str.fullmatch(r"0\d:\d+")
no_leading_zero = cleaned_df['Time'].str.fullmatch(r"\d:\d+")

print(f"{leading_zero.sum()} with a leading zero, {no_leading_zero.sum()} without.")

2973 with a leading zero, 294 without.


In [34]:
cleaned_df['Time'] = cleaned_df['Time'].str.zfill(5)

leading_zero = cleaned_df['Time'].str.fullmatch(r"0\d:\d+")
no_leading_zero = cleaned_df['Time'].str.fullmatch(r"\d:\d+")

print("After revision:")
print(f"{leading_zero.sum()} with a leading zero, {no_leading_zero.sum()} without.")

After revision:
3267 with a leading zero, 0 without.
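
The padding step works because `zfill` only adds zeros to strings shorter than the requested width. A minimal sketch with the built-in `str.zfill`, which should behave the same way as `Series.str.zfill` for inputs like these:

```python
# Sample time strings in the formats seen in the Time column.
times = ["9:30", "09:30", "18:00", "9:14 11:16"]

padded = [t.zfill(5) for t in times]
print(padded)  # ['09:30', '09:30', '18:00', '9:14 11:16']
```

The multi-time string from the Major marathon is already longer than five characters, so it passes through unchanged and still needs separate handling.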


# Insights

While some data quality issues remain, there are still questions that can be answered, and insights gained, from the data we have. For example, from the distribution of hours above, we can see that the most-viewed anime are largely concentrated in the early evening and the morning, with a smaller grouping late at night. This invites a first question: What have been the most-viewed anime at different times of day?

We already have a series giving the hour for each airtime, so it makes sense to reuse that.

In [35]:
def top_programs_for_hour(h: int, n: int):
    # h: An hour of the day.
    # n: How many distinct combinations of show/station/day/time
    #    we want to display.
    return cleaned_df.loc[
        hour == h,
        ['Title', 'Station', 'Listed day of week', 'Time']
    ].value_counts().head(n)

We'll start by looking at the morning hours.

In [36]:
top_programs_for_hour(8, 10)

Title                                       Station   Listed day of week  Time 
Animated O-saru no George (Curious George)  NHK-E     Sat                 08:35    370
Soaring Sky! Pretty Cure                    TV Asahi  Sun                 08:30     48
Wonderful Precure!                          TV Asahi  Sun                 08:30     47
Fresh Precure!                              TV Asahi  Sun                 08:30     46
Smile Precure!                              TV Asahi  Sun                 08:30     45
Hugtto! Precure                             TV Asahi  Sun                 08:30     45
Heartcatch Precure!                         TV Asahi  Sun                 08:30     44
Star ☆ Twinkle Precure                      TV Asahi  Sun                 08:30     43
Tropical-Rouge! Precure                     TV Asahi  Sun                 08:30     43
Dokidoki! Precure                           TV Asahi  Sun                 08:30     43
Name: count, dtype: int64

Curious George has the highest individual count, but the rest of the chart is dominated by Toei Animation and TV Asahi's hugely popular girls' superhero series Pretty Cure ("Precure" for short), which changes its title once a year. In this case, we can use a regular expression to group them together.

In [37]:
hour8 = hour == 8
hour8_sum = hour8.sum()
hour8_titles = cleaned_df.loc[hour8, 'Title']
precure_sum = hour8_titles.str.contains(r"Pre.+ure").sum()
george_sum = hour8_titles.str.contains(r"Curious George").sum()

print(
    f"The {hour8_sum} listings for the 08:00-09:00 hour are composed of "
    f"{precure_sum} for Pretty Cure, {george_sum} for Curious George, and "
    f"{hour8_sum - precure_sum - george_sum} for other anime."
)

The 1177 listings for the 08:00-09:00 hour are composed of 729 for Pretty Cure, 386 for Curious George, and 62 for other anime.


The Curious George count here is slightly higher than in the previous table because of occasional variations in airtime and title:

In [38]:
cleaned_df.loc[
    hour8 & cleaned_df['Title'].str.contains(r"Curious George"),
    ['Title', 'Time']
].value_counts()

Title                                                 Time 
Animated O-saru no George (Curious George)            08:35    370
                                                      08:00      8
                                                      08:30      2
Animated O-saru no George (Curious George) Selection  08:35      2
Animated O-saru no George (Curious George)            08:10      1
                                                      08:25      1
Animated O-saru no George (Curious George) Selection  08:20      1
Animated O-saru no George (Curious George) Special    08:25      1
Name: count, dtype: int64

Most of the remaining morning listings are for the 9:00 hour, which we'll examine next:

In [39]:
top_programs_for_hour(9, 10)

Title                                         Station  Listed day of week  Time 
One Piece                                     Fuji TV  Sun                 09:30    843
Oshiri Tantei                                 NHK-E    Sat                 09:00    223
Toriko                                        Fuji TV  Sun                 09:00    139
Dragon Ball Super                             Fuji TV  Sun                 09:00    128
Animation Hitsuji no Shaun (Shaun the Sheep)  NHK-E    Sat                 09:00     97
GeGeGe no Kitarō                              Fuji TV  Sun                 09:00     96
Dragon Ball Kai                               Fuji TV  Sun                 09:00     94
Gegege no Kitarō                              Fuji TV  Sun                 09:00     89
Dragon Ball Z Kai                             Fuji TV  Sun                 09:00     58
One Piece Special Edition                     Fuji TV  Sat                 09:55     37
Name: count, dtype: int64

This hour is dominated by One Piece, which aired on Sundays at 9:30 AM from 2006 to 2025. It's also clear that Fuji TV has aired a variety of anime in the immediately preceding timeslot with good results, including two entries in the GeGeGe no Kitarō franchise, Dragon Ball Kai and Super, and, from 2011 to 2014, Toriko. The timeslot has also broadcast most Digimon series since the original Digimon Adventure -- see the Japanese Wikipedia's article on [Fuji TV's 9 AM Sunday anime slot](https://ja.wikipedia.org/wiki/%E3%83%95%E3%82%B8%E3%83%86%E3%83%AC%E3%83%93%E6%97%A5%E6%9B%9C%E6%9C%9D9%E6%99%82%E5%8F%B0%E6%9E%A0%E3%81%AE%E3%82%A2%E3%83%8B%E3%83%A1) for more details.

Looking at these two listings together, it's clear that the most popular morning anime are broadcast on weekends -- which stands to reason, as most of the audience will be away from home on weekday mornings. A division by day and station is also apparent. On Saturday mornings, the most-watched anime are on NHK Educational TV's [E-Tele Kids](https://ja.wikipedia.org/wiki/E%E3%83%86%E3%83%AC%E3%82%AD%E3%83%83%E3%82%BA) block for preschoolers, but Sunday mornings are for their older siblings' favorite action shows. One can imagine a young girl waking up and turning on TV Asahi to watch Pretty Cure at 8:30, then sticking around for Kamen Rider immediately afterward -- instead of staying for Super Sentai in the second half of [Super Hero Time](https://en.wikipedia.org/wiki/Super_Hero_Time), though, she changes the channel to Fuji TV for One Piece.

All this highlights the significance of timeslots to anime broadcasting. Their importance in TV broadcasting generally is common sense, but international anime fans are almost entirely divorced from the way TV anime is originally broadcast in Japan, so timeslots can easily become invisible. Consider another timeslot that has proven significant for viewership: [Yomiuri TV's Saturday evening anime](https://ja.wikipedia.org/wiki/%E8%AA%AD%E5%A3%B2%E3%83%86%E3%83%AC%E3%83%93%E5%88%B6%E4%BD%9C%E5%9C%9F%E6%9B%9C%E5%A4%95%E6%96%B9%E6%9E%A0%E3%81%AE%E3%82%A2%E3%83%8B%E3%83%A1), with the enormously popular Detective Conan at 6:00 PM since 2009, and another anime immediately before since 2013.

In [40]:
pre_conan_slot = ((cleaned_df['Station'] == 'NTV')
& (cleaned_df['Time'] == '17:30')
& (cleaned_df['Listed day of week'] == 'Sat'))

cleaned_df.loc[
    pre_conan_slot,
    ['Title', 'Date', 'Average Household Rating']
].groupby(pd.Grouper(key='Date', freq='2MS')).first().tail(20)

| Date | Title | Average Household Rating |
|---|---|---|
| 2021-12-01 | Yashahime: Princess Half-Demon - The Second Act | 4.0 |
| 2022-02-01 | Yashahime: Princess Half-Demon - The Second Act | 4.8 |
| 2022-04-01 | Love All Play (debut) | 3.1 |
| 2022-06-01 | Love All Play | 3.3 |
| 2022-08-01 | Love All Play | 2.9 |
| 2022-10-01 | My Hero Academia Season 6 (Premiere) | 3.9 |
| 2022-12-01 | My Hero Academia Season 6 | 5.3 |
| 2023-02-01 | My Hero Academia Season 6 | 3.6 |
| 2023-04-01 | MIX Season 2 (premiere) | 3.1 |
| 2023-06-01 | MIX Season 2 | 2.8 |


The 5:30 PM timeslot's occupant has changed every six months since 2014. Of the series that have aired in that time, by far the most prominent internationally is My Hero Academia, which has aired in this slot since its second season. Others, such as the Ace Attorney anime or the Inuyasha sequel Yashahime, have much smaller international followings, but are still recognizable in fan circles. However, those same viewers may be puzzled to find series they've never heard of on these rankings, such as Blue Miburo, Firefighter Daigo: Rescuer in Orange, Love All Play, or Hakushon Daimaō 2020. Such is the power of a valuable timeslot.
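
One way to check the turnover in a slot like this is to record the first date on which each title appears. The following is a minimal sketch on hand-written rows, not on `cleaned_df`. The helper names `base_title` and `first_appearances` are hypothetical, and it assumes that stripping a trailing parenthetical is enough to merge annotations such as "(debut)" with the plain title.

```python
import re
from datetime import date

def base_title(title: str) -> str:
    """Drop a trailing parenthetical note such as ' (debut)'."""
    return re.sub(r" \([^()]*\)$", "", title)

def first_appearances(rows):
    """Map each base title to the earliest date on which it is listed."""
    first = {}
    for day, title in rows:
        key = base_title(title)
        if key not in first or day < first[key]:
            first[key] = day
    # Order the result chronologically by first appearance.
    return dict(sorted(first.items(), key=lambda item: item[1]))

# Illustrative rows only; treat the dates as placeholders.
rows = [
    (date(2022, 4, 9), "Love All Play"),
    (date(2022, 4, 2), "Love All Play (debut)"),
    (date(2022, 10, 1), "My Hero Academia Season 6 (Premiere)"),
    (date(2022, 10, 8), "My Hero Academia Season 6"),
]

for title, day in first_appearances(rows).items():
    print(day.isoformat(), title)
```

The gaps between consecutive first-appearance dates then show how long each occupant held the slot.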

According to [figures published by the Association of Japanese Animations](https://aja.gr.jp/english/japan-anime-data), every year since 2015, a majority of minutes of TV anime produced have been for late-night anime -- a relatively new trend, as daytime minutes exceeded late-night minutes by more than 2:1 as recently as 2003, and late-night anime broadcasts were not common at all before 1996. Today, not only does late-night anime account for a majority of production, it makes up the vast majority of anime watched by international viewers. As we have seen, however, it is relatively uncommon in the live viewership rankings, for understandable reasons of convenience, and perhaps also because the bulk of them air on independent stations. It's worth knowing what the exceptions have been.

In [41]:
late_night = (hour < 6) | (hour > 20)
cleaned_df.loc[late_night, 'Title'].value_counts().head(15)

Title
Frieren: Beyond Journey's End                                23
That Time I Got Reincarnated as a Slime season 3             21
Spy×Family                                                   18
Lupin the 3rd Part 6                                         14
Dragon Ball Daima                                            11
Demon Slayer: Kimetsu no Yaiba Entertainment District Arc    10
Jujutsu Kaisen season 2                                      10
Demon Slayer: Kimetsu no Yaiba Swordsmith Village Arc         9
The Apothecary Diaries                                        7
Demon Slayer: Kimetsu no Yaiba Hashira Training Arc           6
Demon Slayer: Kimetsu no Yaiba Mugen Train Arc                6
Magilumiere Co. Ltd.                                          6
Spy×Family season 2                                           5
Attack on Titan The Final Season                              3
Edens Zero                                                    3
Name: count, dtype: int64

Every season of the overwhelmingly popular Demon Slayer: Kimetsu no Yaiba after its first is well-represented. Anime fans familiar with other titles will notice that this ranking is heavily weighted toward more recent titles -- our dataset goes back to 2007, but Frieren premiered in 2023, Spy x Family in 2022, and multiple other series listed are from 2024. Have late-night anime become more common on these rankings than they used to be, then?

In [42]:
cleaned_df.loc[late_night, 'Date'].dt.year.value_counts().sort_index()

Date
2007     6
2008     8
2009     9
2010     5
2011     4
2012     4
2013     4
2014     3
2015     7
2016     6
2017     2
2018     4
2019     5
2020     6
2021    34
2022    40
2023    51
2024    73
2025     7
Name: count, dtype: int64

Indeed they have, dramatically so. In the late 2000s, late-night shows in the top 10 anime broadcasts were a rarity, and mostly restricted to Fuji TV's relatively prestigious [noitaminA](https://en.wikipedia.org/wiki/Noitamina) block.

In [43]:
cleaned_df.loc[late_night].sort_values(by='Date').head(20)

|   | Title | Station | Date | Time | Length | Average Household Rating | Listed day of week | Publish date | First day of week | Computed day of week |
|---|---|---|---|---|---|---|---|---|---|---|
| 9312 | Noitamina - Nodame Cantabile | Fuji TV | 2007-05-03 | 00:45 | 30 min. | 5.1 | Thu | 2007-05-24 | 2007-04-30 | Thu |
| 9255 | You're Under Arrest: Full Throttle | TBS | 2007-10-04 | 01:55 | 30 min. | 4.0 | Thu | 2007-11-30 | 2007-10-01 | Thu |
| 9243 | Noitamina: Moyashimon | Fuji TV | 2007-10-11 | 00:35 | 30 min. | 4.9 | Thu | 2007-11-30 | 2007-10-08 | Thu |
| 9235 | Noitamina: Moyashimon | Fuji TV | 2007-10-18 | 00:45 | 30 min. | 5.3 | Thu | 2007-11-30 | 2007-10-15 | Thu |
| 9161 | Noitamina: Moyashimon [Tales of Agriculture] | Fuji TV | 2007-12-06 | 00:45 | 30 min. | 4.9 | Thu | 2007-12-14 | 2007-12-03 | Thu |
| 9141 | Noitamina: Moyashimon [Tales of Agriculture] (... | Fuji TV | 2007-12-20 | 01:00 | 30 min. | 5.2 | Thu | 2007-12-30 | 2007-12-17 | Thu |
| 9121 | Noitamina: Hakaba Kitarō | Fuji TV | 2008-01-10 | 00:45 | 30 min. | 4.8 | Thu | 2008-01-22 | 2008-01-07 | Thu |
| 8895 | Noitamina: Library War | Fuji TV | 2008-06-05 | 00:55 | 30 min. | 4.5 | Thu | 2008-06-15 | 2008-06-02 | Thu |
| 8815 | Friday Special Roadshow Lupin III Special: Swe... | NTV | 2008-07-25 | 21:03 | 111 min. | 14.4 | Fri | 2008-08-05 | 2008-07-21 | Fri |
| 8777 | Friday Special Roadshow Death Note Rewrite 2: ... | NTV | 2008-08-22 | 21:03 | 111 min. | 11.8 | Fri | 2008-09-02 | 2008-08-18 | Fri |
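
Raw yearly counts depend on how many weeks of rankings each year contributes, and the first and last years of the dataset appear to be only partly covered. A hedged alternative is to look at late-night listings as a share of all listings in each year. The sketch below uses a toy frame with made-up `Year` and `Hour` columns rather than `cleaned_df`, and applies the same late-night definition used for the counts above.

```python
import pandas as pd

# Toy stand-in for the real data: one row per charted broadcast.
toy = pd.DataFrame({
    "Year": [2019, 2019, 2019, 2019, 2024, 2024, 2024, 2024],
    "Hour": [9, 18, 18, 23, 9, 18, 23, 0],
})

# Same definition as in the notebook: before 06:00, or 21:00 and later.
late_night = (toy["Hour"] < 6) | (toy["Hour"] > 20)

# The mean of a boolean Series is the fraction of True values.
late_share = late_night.groupby(toy["Year"]).mean()
print(late_share)
```

On the real data, the equivalent would presumably be `late_night.groupby(cleaned_df['Date'].dt.year).mean()`, assuming `late_night` shares its index with `cleaned_df`.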


The reader may be curious to know what programs appear most often in these rankings, if most (but by no means all) of the ones best-known internationally are absent:

In [44]:
cleaned_df['Title'].value_counts().head(15)

Title
Chibi Maruko-chan                               844
One Piece                                       843
Sazae-san                                       839
Detective Conan                                 805
Doraemon                                        661
Crayon Shin-chan                                630
Animated O-saru no George (Curious George)      383
Oshiri Tantei                                   242
Toriko                                          139
Dragon Ball Super                               128
Animation Hitsuji no Shaun (Shaun the Sheep)    105
Pocket Monsters: Diamond & Pearl                 98
GeGeGe no Kitarō                                 96
Soreike! Anpanman                                95
Dragon Ball Kai                                  95
Name: count, dtype: int64

Again, we must keep in mind that Pretty Cure changes its title annually; if all series were represented under a single title, they would be fifth in this ranking. Pokémon faces a similar issue, having aired under several different titles, though it doesn't change as often.

In [45]:
print("Total Pokémon entries: {}".format(
    cleaned_df['Title'].str.contains(r"Pocket Mon|Pokémon", regex=True).sum()
))

Total Pokémon entries: 448
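
The same idea extends to any franchise whose air title changes over time. As a rough sketch, the snippet below maps titles to a franchise label with regular expressions before counting. The `FRANCHISES` table, the `franchise_of` helper, and the sample titles are all illustrative assumptions rather than part of the notebook; only the two patterns are taken from the cells above.

```python
import re
from collections import Counter

# Hypothetical label -> pattern table, reusing the patterns from the text.
FRANCHISES = {
    "Pretty Cure": re.compile(r"Pre.+ure"),
    "Pokémon": re.compile(r"Pocket Mon|Pokémon"),
}

def franchise_of(title: str) -> str:
    """Return the franchise label for a title, or the title itself."""
    for label, pattern in FRANCHISES.items():
        if pattern.search(title):
            return label
    return title

# Made-up listing titles for illustration.
titles = [
    "Fresh Precure!",
    "Smile Precure!",
    "Pocket Monsters: Diamond & Pearl",
    "Pokémon: Sun & Moon",
    "Sazae-san",
]

counts = Counter(franchise_of(t) for t in titles)
print(counts.most_common())
```

Applied to a full title column, a mapping like this would let the overall ranking be recomputed with each franchise under a single label.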


Most weeks, the highest-rated anime is Sazae-san, a sitcom that holds the Guinness World Record for the longest-running animated television series, having aired since 1969. In a typical year, it airs a new episode almost every week, with exceptions for holidays. The series has never been released internationally, nor has it ever been released on home video due to the original creator's wishes.

It is very rare for a new episode of Sazae-san to air, but *not* be the highest-rated anime. So rare, in fact, that the instances in our dataset can all be listed individually.

In [46]:
def by_week(df: pd.DataFrame):
    return df.groupby('First day of week')

def sazae_san_dethroned(week_df: pd.DataFrame):
    if (~week_df['Title'].str.contains('Sazae')).all():
        return False # No Sazae-san episode this week.
    if "Sazae" in week_df.loc[
        week_df['Average Household Rating'].idxmax(),
        'Title'
    ]:
        return False # Sazae-san was the top show.
    return True

sazae_dethroned_weeks = by_week(cleaned_df).filter(sazae_san_dethroned)
conquerors = by_week(sazae_dethroned_weeks)['Average Household Rating'].idxmax()
cleaned_df.loc[conquerors].sort_values(by='Date')

|   | Title | Station | Date | Time | Length | Average Household Rating | Listed day of week | Publish date | First day of week | Computed day of week |
|---|---|---|---|---|---|---|---|---|---|---|
| 8460 | NTV 55/YTV 50 Friday Special Roadshow: Lupin I... | NTV | 2009-03-27 | 21:00 | 129 min. | 19.5 | Fri | 2009-04-06 | 2009-03-23 | Fri |
| 6660 | One Piece Episode of Nami: Kōkaishi no Namida ... | Fuji TV | 2012-08-25 | 21:00 | 130 min. | 11.6 | Sat | 2012-09-02 | 2012-08-20 | Sat |
| 5592 | One Piece "3D2Y" Ace no Shi wo Koete! Luffy Na... | Fuji TV | 2014-08-30 | 21:00 | 130 min. | 12.2 | Sat | 2014-09-08 | 2014-08-25 | Sat |
| 4697 | Crayon Shin-chan | TV Asahi | 2016-05-20 | 19:30 | 24 min. | 8.3 | Fri | 2016-05-26 | 2016-05-16 | Fri |
| 2431 | Detective Conan | NTV | 2020-08-22 | 17:30 | 30 min. | 7.9 | Sat | 2020-08-28 | 2020-08-17 | Sat |
| 2400 | Kinyō Road Show! Meitantei Conan Episode ONE C... | NTV | 2020-09-11 | 21:00 | 114 min. | 10.4 | Fri | 2020-09-18 | 2020-09-07 | Fri |
| 2390 | Kinyō Road Show! Meitantei Conan: Edogawa Cona... | NTV | 2020-09-18 | 21:00 | 114 min. | 10.5 | Fri | 2020-09-28 | 2020-09-14 | Fri |
| 2360 | Demon Slayer: Kimetsu no Yaiba: Bonds of Siblings | Fuji TV | 2020-10-10 | 21:00 | 130 min. | 16.7 | Sat | 2020-10-16 | 2020-10-05 | Sat |
| 2350 | Kimetsu no Yaiba: Natagumoyama-hen | Fuji TV | 2020-10-17 | 21:00 | 170 min. | 15.4 | Sat | 2020-10-23 | 2020-10-12 | Sat |
| 2259 | Demon Slayer: Kimetsu no Yaiba: Hashira Gō Kai... | Fuji TV | 2020-12-20 | 18:59 | 135 min. | 14.4 | Sun | 2020-12-25 | 2020-12-14 | Sun |


The first broadcasts listed are all specials over two hours long. A 2016 broadcast of Crayon Shin-chan was the first regular anime broadcast to outperform Sazae-san since at least 2007, and the next time after that wasn't until 2020. It is a testament to the colossal popularity of Demon Slayer that it has achieved this feat as many times as it has in a four-year span, including with regular episodes airing after 11:00 PM.

The reader may notice, however, that dethroning Sazae-san doesn't take quite the numbers it used to, as shown by the `Average Household Rating` column. Indeed, Sazae-san's average ratings have declined considerably:

In [47]:
cleaned_df.loc[
    cleaned_df['Title'].str.contains('Sazae')
].groupby(pd.Grouper(key='Date', freq='YS'))['Average Household Rating'].mean()

Date
2007-01-01    18.317391
2008-01-01    17.910000
2009-01-01    18.161702
2010-01-01    19.958696
2011-01-01    19.185714
2012-01-01    17.752083
2013-01-01    16.810417
2014-01-01    15.842000
2015-01-01    13.931250
2016-01-01    12.097959
2017-01-01    11.908511
2018-01-01    11.954167
2019-01-01    12.094118
2020-01-01    10.090196
2021-01-01     9.336000
2022-01-01     8.518750
2023-01-01     7.880000
2024-01-01     7.440000
2025-01-01     8.275000
Freq: YS-JAN, Name: Average Household Rating, dtype: float64

It still usually tops the charts, of course, because the phenomenon is far from unique to Sazae-san. As in most countries, live TV viewership has declined over time in Japan, as people move to time-shifting, streaming, and other forms of entertainment altogether. We can see this trend more broadly in the anime charts by looking at the rating required to break into the *bottom* of the chart in an average week of each year:

In [48]:
bottom_ratings = by_week(cleaned_df)['Average Household Rating'].min()
bottom_ratings.groupby(pd.Grouper(freq='YS')).mean()

First day of week
2007-01-01    4.996000
2008-01-01    5.242308
2009-01-01    5.113725
2010-01-01    5.173077
2011-01-01    4.551923
2012-01-01    4.220755
2013-01-01    4.086538
2014-01-01    4.480769
2015-01-01    3.870588
2016-01-01    3.305769
2017-01-01    3.221154
2018-01-01    2.903774
2019-01-01    2.834615
2020-01-01    2.800000
2021-01-01    2.478846
2022-01-01    2.144231
2023-01-01    1.861538
2024-01-01    1.807547
2025-01-01    1.775000
Freq: YS-JAN, Name: Average Household Rating, dtype: float64
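
To put the two tables side by side, it helps to express each drop as a percentage. The short calculation below takes the 2010 and 2024 yearly means printed above (Sazae-san's average rating and the average chart-bottom rating) and computes the relative decline. Choosing those two years as endpoints is an assumption made for illustration, and `relative_decline` is a hypothetical helper.

```python
def relative_decline(start: float, end: float) -> float:
    """Percentage drop from start to end."""
    return (start - end) / start * 100

# Yearly means copied from the two outputs above (2010 and 2024).
sazae_2010, sazae_2024 = 19.958696, 7.440000
bottom_2010, bottom_2024 = 5.173077, 1.807547

sazae_drop = relative_decline(sazae_2010, sazae_2024)
bottom_drop = relative_decline(bottom_2010, bottom_2024)
print(f"Sazae-san: {sazae_drop:.1f}%  chart bottom: {bottom_drop:.1f}%")
```

Both figures come out between 60% and 70%, which is consistent with the decline being a chart-wide trend rather than something specific to Sazae-san.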

We can also look at another long-running series like One Piece:

In [49]:
def avg_rating_by_year(exact_title: str):
    by_year = cleaned_df.loc[
        cleaned_df['Title'] == exact_title
    ].groupby(pd.Grouper(key='Date', freq='YS'))

    summary = pd.DataFrame({
        'Average rating': by_year['Average Household Rating'].mean(),
        'Episodes charted': by_year.size(),
    }).reset_index()

    return pd.DataFrame({
        'Year': summary['Date'].dt.year,
        'Average rating': summary['Average rating'],
        'Episodes charted': summary['Episodes charted'],
    })

avg_rating_by_year("One Piece")

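|   | Year | Average rating | Episodes charted |
|---|---|---|---|
| 0 | 2007 | 8.795238 | 21 |
| 1 | 2008 | 7.859574 | 47 |
| 2 | 2009 | 9.914583 | 48 |
| 3 | 2010 | 11.47551 | 49 |
| 4 | 2011 | 10.166667 | 48 |
| 5 | 2012 | 9.083673 | 49 |
| 6 | 2013 | 7.99375 | 48 |
| 7 | 2014 | 8.122 | 50 |
| 8 | 2015 | 8.43617 | 47 |
| 9 | 2016 | 7.854348 | 46 |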


Or, with a little more work, we can compare each Pretty Cure series, labeling each one with the year it started (generally in February):

In [50]:
def normalize_precure_titles(t: pd.Series):
    # Remove parenthetical notes like (first episode) and (final episode)
    t = t.str.replace(r" \(.*?\)$", '', regex=True)

    # Two cases where such notes appear without parentheses
    t = t.str.replace(r" Fi\w+ Episode$", '', regex=True)

    # Spelling of "Precure" wasn't standardized at first
    t = t.str.replace(r"Pre-Cure", 'Precure', regex=True)
    return t

# Omit six reruns from when production was delayed due to COVID-19.
# These are listed as "Healin' Good Precure Osarai Selection".
precure_rows = cleaned_df.loc[
    cleaned_df['Title'].str.contains(r"Pre.+ure")
    & ~cleaned_df['Title'].str.endswith("Osarai Selection")
].copy()

precure_rows['Title'] = normalize_precure_titles(precure_rows['Title'])
precure_by_title = precure_rows.groupby('Title')
pd.DataFrame({
    'Year': precure_by_title['Date'].min().dt.year,
    'Average rating': precure_by_title['Average Household Rating'].mean(),
    'Episodes charted': precure_by_title.size(),
}).sort_values(by='Year').reset_index()

|   | Title | Year | Average rating | Episodes charted |
|---|---|---|---|---|
| 0 | Yes! Precure 5 | 2007 | 6.516 | 25 |
| 1 | Yes! Precure 5 GoGo! | 2008 | 5.772973 | 37 |
| 2 | Fresh Precure! | 2009 | 6.553191 | 47 |
| 3 | Heartcatch Precure! | 2010 | 6.637778 | 45 |
| 4 | Suite Precure | 2011 | 5.27619 | 42 |
| 5 | Smile Precure! | 2012 | 5.354348 | 46 |
| 6 | Dokidoki! Precure | 2013 | 4.927273 | 44 |
| 7 | HappinessCharge PreCure! | 2014 | 4.854839 | 31 |
| 8 | Go! Princess Precure | 2015 | 4.255882 | 34 |
| 9 | Mahō Tsukai Precure! | 2016 | 3.6375 | 24 |
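
The normalization step is easy to sanity-check in isolation. The sketch below applies the same three substitutions with the standard `re` module to a few invented titles. `normalize_title` is a hypothetical scalar counterpart of `normalize_precure_titles`, and the sample strings are assumptions about what such annotations look like, not rows from the dataset.

```python
import re

def normalize_title(title: str) -> str:
    """Scalar version of the three clean-up steps used for Pretty Cure titles."""
    title = re.sub(r" \(.*?\)$", "", title)        # trailing parenthetical note
    title = re.sub(r" Fi\w+ Episode$", "", title)  # bare "First/Final Episode"
    return title.replace("Pre-Cure", "Precure")    # early hyphenated spelling

# Invented sample titles for illustration.
samples = [
    "Fresh Precure! (first episode)",
    "Suite Precure Final Episode",
    "Yes! Pre-Cure 5",
]
for s in samples:
    print(f"{s!r} -> {normalize_title(s)!r}")
```

Running a check like this over the distinct titles in `precure_rows` is a quick way to spot any variant the three rules miss before grouping.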


This across-the-board decline in ratings may be related to the phenomenon of late-night anime making the charts more often; late-night shows could be more resistant to the trends that have driven live viewership down. It's likely not the only cause, however -- the wild success of Demon Slayer makes it clear that late-night anime is simply far more mainstream than it once was. The likes of Spy x Family and Jujutsu Kaisen have regularly ranked among Japan's best-selling manga, and were recognized early on as potential major hits, yet their anime adaptations aired late at night, and to great success.

I hope this has proven of interest, and that it will lead to further research -- as well as improved fan understanding of the works they love, and the systems that produce them.