# My Hero Academia Fetcher

![my_hero_academia](https://img1.hulu.com/user/v3/artwork/36e318dc-3daf-47fb-8219-9e3cb5cd28f2?base_image_bucket_name=image_manager&base_image=2d0d3308-9323-4716-b7d8-03f171c844af&size=1200x630&format=jpeg)

I love My Hero Academia, and I'm an avid reader of the weekly manga. I read the early English translations of the scans, which come out (mostly) every Friday. The release times vary, but a new issue is generally available in the afternoon or evening.

When I'm looking for the next issue, I normally search by issue number, which means remembering where I left off. I don't mind it, but there's certainly a better way. Also, some weeks are break weeks with no release at all, and it's useful to have a reminder when that happens.

This script will be used to check [MHARead](https://mharead.com) to see when the newest translation is available. Once it's available, it will send me a text message and an email with the link so that I can go ahead and read the latest issue. If it's a break week, it will also send me a text to remind me that there will be no comic. The script will also keep track of the current, previous, and next issues coming up.

## Tools used

For this project, I will be using `requests`, `BeautifulSoup`, `re`, `time`, `datetime`, and `schedule`. Additionally, I'll be using Windows Task Scheduler to kick the script off every Friday. The overall plan is to use `requests` and `BeautifulSoup` to scrape the site and check for availability, with `re` handling the search for specific text. Finally, I'll be using `twilio` to text myself either the link to the comic or a note that it's a break week.

## Program Flow

On a given Friday:
- Windows Task Scheduler kicks off the script at 8:00 a.m. EST
- The script first checks the date and compares it to the next known release date
- If the current date and release date match, the script checks the site for the manga:
  - If the manga is on the site, it sends me a link to the page via text message
  - If the manga is not on the site, the script checks again in two hours
- If the current date and release date don't match, the script skips the site check and sends me a message saying that it is a break week and when the next issue will be available
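
The branching above can be sketched as one small function. This is only an illustration of the flow under stated assumptions: `next_action` and the three action labels are hypothetical names, not part of the script, and the dates are made up.

```python
from datetime import date


def next_action(today, release_date, manga_available):
    """Map the Friday flow onto one of three hypothetical action labels."""
    if today != release_date:
        # Dates don't match: treat it as a break week and skip the site check.
        return "send_break_message"
    if manga_available:
        # Release day and the page is up: text the link.
        return "send_link"
    # Release day but nothing posted yet: check again in two hours.
    return "retry_in_two_hours"


friday = date(2021, 10, 22)  # made-up dates, for illustration only
print(next_action(friday, friday, True))
print(next_action(friday, friday, False))
print(next_action(friday, date(2021, 10, 29), False))
```

Keeping the decision separate from the scraping and texting makes each branch easy to try with fixed dates.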




In [163]:
#!pip install requests
#!pip install beautifulsoup4 html5lib
#!pip install "twilio>=6.0.0"
#!pip install python-dotenv
#!pip install schedule
import requests
import re
import os
import json
import schedule
import time
import threading
from datetime import date, datetime, timedelta
from bs4 import BeautifulSoup
from twilio.rest import Client
from dotenv import load_dotenv

In [125]:
load_dotenv()

True
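
`load_dotenv()` reads a `.env` file into the process environment, and `send_message` looks up four variables there. A small check like the following could fail early when one of them is missing. `load_twilio_settings` is a hypothetical helper, not part of the script; only the four variable names come from the code in this notebook.

```python
import os

REQUIRED_VARS = ("TWILIO_SID", "TWILIO_AUTH_TOKEN", "MY_PHONE", "TWILIO_NUMBER")


def load_twilio_settings(env=None):
    """Return the required settings as a dict, or raise if any are missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[name] for name in REQUIRED_VARS}


# Placeholder values only; real credentials belong in the .env file.
example_env = {name: "placeholder" for name in REQUIRED_VARS}
settings = load_twilio_settings(example_env)
print(sorted(settings))
```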

In [133]:
def get_issues(path):
    """
    Opens the issues.json file and
    returns the parsed JSON for use in the script
    """
    with open(path, 'r') as f:
        return json.load(f)


def update_issues(path, updates):
    """
    Writes changes to the issues.json file
    back to the source
    """
    with open(path, 'w') as f:
        json.dump(updates, f)


def send_message(message):
    """
    Sends a text message to my phone using the Twilio API
    """
    client = Client(os.getenv('TWILIO_SID'),
                    os.getenv('TWILIO_AUTH_TOKEN'))
    client.messages.create(to=os.getenv('MY_PHONE'),
                           from_=os.getenv('TWILIO_NUMBER'),
                           body=message)


def get_soup(url):
    """
    Returns soup for the requested issue
    """
    # A timeout keeps the script from hanging if the site does not respond
    html = requests.get(url, timeout=30)
    return BeautifulSoup(html.text, 'html5lib')


def get_release_date(soup):
    """
    Takes in a soup object and returns the next
    release date as an 'MM/DD/YYYY' string
    """
    # Fallback: one week from today, i.e. next Friday when run on a Friday
    default_nxt_release = (date.today() + timedelta(days=7)).strftime('%m/%d/%Y')
    # Digits only, since the result is later split on "/" and cast to int
    pattern = re.compile(r"\d{1,2}/\d{1,2}/\d{4}")
    # Try to find the next release date in the page's link text
    matches = [pattern.match(item.text) for item in soup.find_all('a')]
    next_release = [m.group(0) for m in matches if m]
    # If no release date was found, return the default date
    return next_release[0] if next_release else default_nxt_release
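
To see the fallback behaviour of `get_release_date` without hitting the site, the same idea can be run on plain strings. `pick_release_date` is a hypothetical stand-in that takes link texts instead of a soup object, and it assumes the site writes dates as digits in `MM/DD/YYYY` form. The sample texts and dates are made up.

```python
import re
from datetime import date, timedelta

DATE_PATTERN = re.compile(r"\d{1,2}/\d{1,2}/\d{4}")


def pick_release_date(link_texts, today):
    """Return the first date-like link text, else the date one week from `today`."""
    for text in link_texts:
        found = DATE_PATTERN.match(text.strip())
        if found:
            return found.group(0)
    # Nothing date-like on the page: fall back to one week from today.
    return (today + timedelta(days=7)).strftime("%m/%d/%Y")


today = date(2021, 10, 22)  # made-up date for the example
print(pick_release_date(["Home", "11/05/2021", "About"], today))
print(pick_release_date(["Home", "About"], today))
```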
    

In [166]:
# Get issue info
issues = get_issues('issues.json')
prev_issue = issues['prev_issue']
curr_issue = issues['curr_issue']
next_issue = issues['next_issue']
next_release = issues['next_release']

# Check if today is the release date
is_release = True  # assume we have a release since it's a Friday, unless we have a break
if next_release:
    release_date = datetime.strptime(next_release, '%m/%d/%Y').date()
    is_release = date.today() == release_date

# If today is the release date, then continue. Otherwise, stop the script and send the break message
# Break Message:
#send_message(f"Looks like the issue isn't available yet! The release date for the next installment is: {next_release}")


# If today is the release date...
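
The release-date check can also be isolated so it is easy to try with different dates. A minimal sketch, assuming `next_release` is stored as an `MM/DD/YYYY` string; `is_release_day` and `break_week_message` are hypothetical names, and the message wording is only an example.

```python
from datetime import date, datetime


def is_release_day(next_release, today):
    """True when `next_release` ('MM/DD/YYYY') falls on `today`.

    An empty value is treated as a release day, mirroring the
    "assume a release unless told otherwise" default in the cell above.
    """
    if not next_release:
        return True
    return datetime.strptime(next_release, "%m/%d/%Y").date() == today


def break_week_message(next_release):
    """Build the text sent on a break week."""
    return f"It's a break week. The next issue is due on {next_release}."


today = date(2021, 10, 22)  # made-up date for the example
if not is_release_day("10/29/2021", today):
    print(break_week_message("10/29/2021"))
```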

In [187]:
url = f"https://mharead.com/manga/boku-no-hero-academia-my-hero-academia-chapter-{curr_issue}/"
soup = get_soup(url)

In [188]:
job_interval = 2 * 60 * 60  # two hours, in seconds


def run_continuously(interval=job_interval):
    """Continuously run, while executing pending jobs at each
    elapsed time interval.
    @return cease_continuous_run: threading.Event which can
    be set to cease continuous run. Please note that it is
    *intended behavior that run_continuously() does not run
    missed jobs*. For example, if you've registered a job that
    should run every minute and you set a continuous run
    interval of one hour then your job won't be run 60 times
    at each interval but only once.
    """
    cease_continuous_run = threading.Event()

    class ScheduleThread(threading.Thread):
        @classmethod
        def run(cls):
            while not cease_continuous_run.is_set():
                schedule.run_pending()
                time.sleep(interval)

    continuous_thread = ScheduleThread()
    continuous_thread.start()
    return cease_continuous_run


def is_here():
    """
    This checks to see if our manga is ready to view.
    Returns True if it is available; False otherwise.
    """
    # A second <h1> on the page is taken to mean the manga is not out yet.
    # NOTE: this heuristic still needs rework (see the TODO list).
    return len(soup.find_all('h1')) < 2


def get_mha(url):
    """
    Sends SMS with link to latest manga issue
    """
    message = f"The next issue is here! \n{url}"
    send_message(message)


def update_mha_issue():
    """
    Updates JSON file with latest info
    """
    issues['prev_issue'] += 1
    issues['curr_issue'] += 1
    issues['next_issue'] += 1
    issues['next_release'] = get_release_date(soup)
    update_issues('issues.json', issues)


# Disabled until the scheduling logic has been tested (see the TODO list).
# The earlier draft looped on `mha_available` without ever refreshing it or
# `soup`, so it could never exit. Each check below re-fetches the page.
#
# def check_for_release():
#     global soup
#     soup = get_soup(url)
#     if not is_here():
#         return None                  # not out yet: keep the job scheduled
#     get_mha(url)                     # the manga is here: send the link
#     update_mha_issue()               # and update the json file for next week
#     stop_run_continuously.set()      # stop the background job
#     return schedule.CancelJob
#
# schedule.every(2).hours.do(check_for_release)
# stop_run_continuously = run_continuously()  # start the background thread
# check_for_release()                         # first check right away
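
The retry step does not strictly need a background thread. As an alternative to the `schedule`-based approach, the following sketch uses a plain loop that re-runs a check until it succeeds or gives up. `poll_until` and its parameters are hypothetical, and the `sleep` argument is injectable so the loop can be exercised without waiting two hours.

```python
import time


def poll_until(check, interval_seconds, max_attempts, sleep=time.sleep):
    """Call `check()` up to `max_attempts` times, sleeping between failures.

    Returns the 1-based attempt number that succeeded, or None if none did.
    """
    for attempt in range(1, max_attempts + 1):
        if check():
            return attempt
        if attempt < max_attempts:
            sleep(interval_seconds)
    return None


# Simulated check that succeeds on the third call; no real waiting happens.
calls = {"count": 0}


def fake_check():
    calls["count"] += 1
    return calls["count"] >= 3


waits = []
result = poll_until(fake_check, 2 * 60 * 60, max_attempts=5, sleep=waits.append)
print(result, waits)
```

In the script, `check` would need to re-fetch the page before testing for the issue, since `is_here()` reads whatever `soup` was fetched last.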

In [189]:
# Test out our new functions
if is_here():
    print("I am here!")
    get_mha(url)
else:
    print("Something's up!")
    print(curr_issue)
    print(url)
    is_out = soup.find_all('h1')
    print(is_out)

Something's up!
330
https://mharead.com/manga/boku-no-hero-academia-my-hero-academia-chapter-330/
[<h1 class="site-title" id="site-title" itemprop="headline">
								<a href="https://mharead.com/" rel="home">Boku No Hero Academia – My Hero Academia Manga Online</a>
							</h1>, <h1 class="entry-title" itemprop="headline">Boku No Hero Academia – My Hero Academia</h1>]


## TODO for next time

- Need to rework logic for checking if the manga is there or not
- Create unit tests for functions
- Test out schedule
- Break these out into individual scripts
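
As a starting point for the unit-test item, here is a sketch of a round-trip test for the two JSON helpers. The helpers are re-declared so the block runs on its own, the sample values are made up, and the test class name is hypothetical.

```python
import json
import os
import tempfile
import unittest


def get_issues(path):
    with open(path, "r") as f:
        return json.load(f)


def update_issues(path, updates):
    with open(path, "w") as f:
        json.dump(updates, f)


class IssuesRoundTripTest(unittest.TestCase):
    def setUp(self):
        # Work in a temporary directory so the real issues.json is untouched.
        self.tmp = tempfile.TemporaryDirectory()
        self.path = os.path.join(self.tmp.name, "issues.json")

    def tearDown(self):
        self.tmp.cleanup()

    def test_round_trip(self):
        sample = {"prev_issue": 1, "curr_issue": 2, "next_issue": 3,
                  "next_release": "01/07/2022"}
        update_issues(self.path, sample)
        self.assertEqual(get_issues(self.path), sample)

    def test_update_overwrites(self):
        update_issues(self.path, {"curr_issue": 2})
        update_issues(self.path, {"curr_issue": 3})
        self.assertEqual(get_issues(self.path), {"curr_issue": 3})


# Run the suite explicitly, which also works inside a notebook cell.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(IssuesRoundTripTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Functions that touch the network or Twilio would need their dependencies stubbed out before they can be tested the same way.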