## Scraping Tapology.com for UFC

I want to scrape tapology.com for the bout information of UFC events. The goal is to build a few data frames and save them as CSV files for further exploration. I'll start by importing the following modules:

In [3]:
%load_ext autoreload
%autoreload 2

import os
import sys

module_path = os.path.abspath(os.path.join(os.pardir))
if module_path not in sys.path:
    sys.path.append(module_path)

from bs4 import BeautifulSoup
import requests
import pandas as pd
import src
from src import object_test
from src import open_page as op

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


## Scraping an individual event

I'll take the last one as an example and navigate to the page. I should be able to concatenate the link string onto the website url.

In [8]:
event_url = 'fightcenter/events/65142-ufc-on-espn-27'
event_url

'fightcenter/events/65142-ufc-on-espn-27'
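Plain concatenation only works while the base URL ends in a slash and the path does not start with one. As a small sketch, `urljoin` from the standard library handles both shapes of path; `BASE_URL` and `bout_path` are names I'm introducing here for illustration.

```python
from urllib.parse import urljoin

BASE_URL = 'https://www.tapology.com/'  # assumed site root
event_url = 'fightcenter/events/65142-ufc-on-espn-27'

# Works for a relative path without a leading slash...
full_event_url = urljoin(BASE_URL, event_url)

# ...and for a path that already starts with a slash (hypothetical example path).
bout_path = '/fightcenter/bouts/example-bout'
full_bout_url = urljoin(BASE_URL, bout_path)

print(full_event_url)
print(full_bout_url)
```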

Okay, now that I have the link, I'm going to pull all the tables from that URL and see what's in them.

The last two are tables that are advertising other parts of the website (list of upcoming events and a list of the best WW fights). The first three are useless as well.

## Beautiful Soup

Instead I'll use Beautiful Soup. First let's find all the lists.

In [9]:
html = requests.get('https://www.tapology.com/'+event_url).content
soup = BeautifulSoup(html, 'html.parser')
all_lists = soup.find_all('ul')
all_lists[0]

<ul>
<li class="field"><div class="input string required identities_password_uid"><input class="string required show_hint" id="identities_password_uid" name="identities_password[uid]" title="Username" type="text"/></div></li>
<li class="field"><div class="input password optional identities_password_password"><input class="password optional show_hint" id="identities_password_password" name="identities_password[password]" title="Password" type="password"/></div></li>
<li class="btn"><input class="btn tapInSlider submit" data-disable-with="Create Password" name="commit" type="submit" value="Create Password"/></li>
<input id="identities_password_remember" name="identities_password[remember]" type="hidden" value="true"/>
</ul>

The first list looks like it contains login info. Really we just want the bout information list and the event information list. The event info is under the clearfix class and the bout info is under the fightCard class. First let's get the bout info because it's the most important.

In [10]:
bout_info = soup.find(class_='fightCard')
# bout_info_rows = bout_info.find_all('li')
# bout_info_rows

This doesn't look like the same fightcard, so I'm going to look at all the fight card lists on this page.

In [11]:
fightCards = soup.find_all(class_='fightCard')
len(fightCards)

6

## User-Agent
The previous code seems to return a random page from Tapology rather than the event page. I'm going to try setting a User-Agent header and see if that lets me access the right page.

In [13]:
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'}

response = requests.get('https://www.tapology.com/'+event_url, headers=headers)
response

<Response [200]>
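Since every later request needs the same header, one option is to put it on a `requests.Session` once. This is only a sketch: `make_session`, `fetch_html`, `BASE_URL` and the 10-second timeout are my own assumptions, not part of the `src` package.

```python
import requests

BASE_URL = 'https://www.tapology.com/'  # assumed site root
USER_AGENT = ('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
              '(KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36')

def make_session():
    """Return a session that sends the browser-like User-Agent on every request."""
    session = requests.Session()
    session.headers.update({'User-Agent': USER_AGENT})
    return session

def fetch_html(session, path, timeout=10):
    """Fetch a Tapology path and return the HTML text, raising on HTTP errors."""
    response = session.get(BASE_URL + path.lstrip('/'), timeout=timeout)
    response.raise_for_status()  # turn 4xx/5xx responses into exceptions
    return response.text
```

A 200 status only says the request succeeded. Checking the parsed page (for example, how many `fightCard` elements it has) is still the way to confirm it is the right page.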

## Scraping the bouts table
The first step is to get the element that holds the bouts, which sits under the `fightCard` class.

In [31]:
soup = BeautifulSoup(response.text, 'html.parser')
bout_info = soup.find(class_='fightCard')

# empty DataFrame that the bout rows will be added to
df = pd.DataFrame()

li_bouts = bout_info.find_all('div', class_='fightCardBout')
len(li_bouts)


12

Given one of those 12 bout elements, what can I pull out of it?

In [32]:
first_bout = li_bouts[0]
bout_dict = {}

Is there a way to get the class names of every element inside?

In [33]:
for elem in first_bout.find_all(recursive=True):
    print(elem.get('class'), str(elem)[:8])

['fightCardResultHolder'] <div cla
['fightCardResult'] <div cla
['title'] <span cl
None <br/>
['result'] <span cl
None <br/>
['time'] <span cl
['fightCardBoutNumber'] <div cla
['fightCardFighterImage'] <div cla
None <img alt
['fightCardFighterBout', 'left', 'win'] <div cla
['fightCardFighterName', 'left'] <div cla
None <a href=
['resultIcon'] <span cl
None <img alt
['fightCardRecord'] <div cla
['fighterFlag'] <span cl
['fightCardFlag'] <img alt
['fightCardFighterRank'] <div cla
['fightCardMatchup'] <div cla
None <table>

None <tr>
<td
None <td>
<sp
['billing'] <span cl
None <a href=
None <br/>
['fightCardWeight'] <div cla
['title'] <span cl
None <img alt
['weight'] <span cl
['title'] <span cl
None <img alt
None <br/>
['fightCardFighterBout', 'loss', 'right'] <div cla
['fightCardFighterName', 'right'] <div cla
['resultIcon'] <span cl
None <img alt
None <a href=
['fightCardRecord'] <div cla
['fighterFlag'] <span cl
['fightCardFlag'] <img alt
['fightCardFighterRank'] <div cla
['fightCardF

The first two things I want are the result (the method of victory) and the time.

In [34]:
bout_dict['method'] = [first_bout.find(class_='result').get_text()]
bout_dict

{'method': ['\nKO/TKO, Right Cross to Ground and Pound\n']}

In [35]:
bout_dict['time'] = [first_bout.find(class_='time').get_text()]
bout_dict

{'method': ['\nKO/TKO, Right Cross to Ground and Pound\n'],
 'time': ['1:54 Round 2 of 5, 6:54 Total']}
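The time string packs several values into one field. Below is a hedged sketch of how it could be split later with regular expressions. It only covers the three shapes that show up in this event's output (`'1:54 Round 2 of 5, 6:54 Total'`, `'3:37 Round 1 of 3'`, `'3 Rounds, 15:00 Total'`) and assumes five-minute rounds; `parse_bout_time` is a hypothetical helper name.

```python
import re

# Patterns for the time formats seen in this event (assumed, not exhaustive).
FINISH_RE = re.compile(r"^(\d+):(\d{2}) Round (\d+) of (\d+)")
DECISION_RE = re.compile(r"^(\d+) Rounds, (\d+):(\d{2}) Total$")

def parse_bout_time(text):
    """Return (ending_round, scheduled_rounds, total_seconds), or None if unrecognised."""
    text = text.strip()
    match = FINISH_RE.match(text)
    if match:
        minutes, seconds, ending_round, scheduled = map(int, match.groups())
        # Assumes 5-minute rounds: completed rounds plus time elapsed in the final round.
        total = (ending_round - 1) * 300 + minutes * 60 + seconds
        return ending_round, scheduled, total
    match = DECISION_RE.match(text)
    if match:
        rounds, minutes, seconds = map(int, match.groups())
        return rounds, rounds, minutes * 60 + seconds
    return None

print(parse_bout_time('1:54 Round 2 of 5, 6:54 Total'))
```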

Now I want their names

In [36]:
names = first_bout.find_all(class_='fightCardFighterName')

bout_dict['fighter_0'] = [names[0].get_text()]
bout_dict

{'method': ['\nKO/TKO, Right Cross to Ground and Pound\n'],
 'time': ['1:54 Round 2 of 5, 6:54 Total'],
 'fighter_0': ['\nDeiveson Figueiredo\n\n\n']}

In [37]:
bout_dict['fighter_1'] = [names[1].get_text()]
bout_dict

{'method': ['\nKO/TKO, Right Cross to Ground and Pound\n'],
 'time': ['1:54 Round 2 of 5, 6:54 Total'],
 'fighter_0': ['\nDeiveson Figueiredo\n\n\n'],
 'fighter_1': ['\n\n\nJoseph Benavidez\n']}

Now the weight for the bout.

In [38]:
bout_dict['weight'] = [first_bout.find(class_='weight').get_text()]
bout_dict

{'method': ['\nKO/TKO, Right Cross to Ground and Pound\n'],
 'time': ['1:54 Round 2 of 5, 6:54 Total'],
 'fighter_0': ['\nDeiveson Figueiredo\n\n\n'],
 'fighter_1': ['\n\n\nJoseph Benavidez\n'],
 'weight': ['125']}

In [39]:
bout_dict['billing'] = [first_bout.find(class_='billing').get_text()]
bout_dict

{'method': ['\nKO/TKO, Right Cross to Ground and Pound\n'],
 'time': ['1:54 Round 2 of 5, 6:54 Total'],
 'fighter_0': ['\nDeiveson Figueiredo\n\n\n'],
 'fighter_1': ['\n\n\nJoseph Benavidez\n'],
 'weight': ['125'],
 'billing': ['\nMain Event\n']}

In [40]:
bout_dict['bout_number'] = [first_bout.find(class_='fightCardBoutNumber').get_text()]
bout_dict

{'method': ['\nKO/TKO, Right Cross to Ground and Pound\n'],
 'time': ['1:54 Round 2 of 5, 6:54 Total'],
 'fighter_0': ['\nDeiveson Figueiredo\n\n\n'],
 'fighter_1': ['\n\n\nJoseph Benavidez\n'],
 'weight': ['125'],
 'billing': ['\nMain Event\n'],
 'bout_number': ['12']}

In [41]:
bout_dict['event_link'] = [event_url]
bout_dict

{'method': ['\nKO/TKO, Right Cross to Ground and Pound\n'],
 'time': ['1:54 Round 2 of 5, 6:54 Total'],
 'fighter_0': ['\nDeiveson Figueiredo\n\n\n'],
 'fighter_1': ['\n\n\nJoseph Benavidez\n'],
 'weight': ['125'],
 'billing': ['\nMain Event\n'],
 'bout_number': ['12'],
 'event_link': ['fightcenter/events/65142-ufc-on-espn-27']}

In [42]:
bout_dict['link'] = [first_bout.find(class_='billing').find('a').get('href')]
bout_dict

{'method': ['\nKO/TKO, Right Cross to Ground and Pound\n'],
 'time': ['1:54 Round 2 of 5, 6:54 Total'],
 'fighter_0': ['\nDeiveson Figueiredo\n\n\n'],
 'fighter_1': ['\n\n\nJoseph Benavidez\n'],
 'weight': ['125'],
 'billing': ['\nMain Event\n'],
 'bout_number': ['12'],
 'event_link': ['fightcenter/events/65142-ufc-on-espn-27'],
 'link': ['/fightcenter/bouts/475556-ufc-on-espn-27-joseph-the-beefcake-benavidez-vs-deiveson-deus-da-guerra-figueiredo']}

In [43]:
# now I can update my df with this info by turning it into a df and using combine first

bout_df = pd.DataFrame(bout_dict)

df = df.combine_first(bout_df)

In [44]:
df

Unnamed: 0,method,time,fighter_0,fighter_1,weight,billing,bout_number,event_link,link
0,"\nKO/TKO, Right Cross to Ground and Pound\n","1:54 Round 2 of 5, 6:54 Total",\nDeiveson Figueiredo\n\n\n,\n\n\nJoseph Benavidez\n,125,\nMain Event\n,12,fightcenter/events/65142-ufc-on-espn-27,/fightcenter/bouts/475556-ufc-on-espn-27-josep...


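Now I'll loop over every bout and repeat the same steps. Note that `df` already holds the first bout from the `combine_first` cell above, so that bout appears twice in the output below.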

In [48]:
for bout in li_bouts:
    bout_dict = {}

    bout_dict['method'] = [bout.find(class_='result').get_text()]

    bout_dict['time'] = [bout.find(class_='time').get_text()]

    names = bout.find_all(class_='fightCardFighterName')
    bout_dict['fighter_0'] = [names[0].get_text()]
    bout_dict['fighter_1'] = [names[1].get_text()]

    bout_dict['weight'] = [bout.find(class_='weight').get_text()]

    bout_dict['billing'] = [bout.find(class_='billing').get_text()]

    bout_dict['bout_number'] = [bout.find(class_='fightCardBoutNumber').get_text()]

    bout_dict['event_link'] = [event_url]

    bout_dict['link'] = [bout.find(class_='billing').find('a').get('href')]

    # turn this bout's dict into a one-row df and append it to the running df with concat

    bout_df = pd.DataFrame(bout_dict)

    df = pd.concat([df, bout_df])

df.set_index('link')

link,method,time,fighter_0,fighter_1,weight,billing,bout_number,event_link
/fightcenter/bouts/475556-ufc-on-espn-27-joseph-the-beefcake-benavidez-vs-deiveson-deus-da-guerra-figueiredo,"\nKO/TKO, Right Cross to Ground and Pound\n","1:54 Round 2 of 5, 6:54 Total",\nDeiveson Figueiredo\n\n\n,\n\n\nJoseph Benavidez\n,125,\nMain Event\n,12,fightcenter/events/65142-ufc-on-espn-27
/fightcenter/bouts/475556-ufc-on-espn-27-joseph-the-beefcake-benavidez-vs-deiveson-deus-da-guerra-figueiredo,"\nKO/TKO, Right Cross to Ground and Pound\n","1:54 Round 2 of 5, 6:54 Total",\nDeiveson Figueiredo\n\n\n,\n\n\nJoseph Benavidez\n,125,\nMain Event\n,12,fightcenter/events/65142-ufc-on-espn-27
/fightcenter/bouts/475132-ufc-on-espn-27-felicia-feenom-spencer-vs-zarah-infinite-fairn,"\nKO/TKO, Ground and Pound\n",3:37 Round 1 of 3,\nFelicia Spencer\n\n\n,\n\n\nZarah Fairn\n,145,\nCo-Main Event\n,11,fightcenter/events/65142-ufc-on-espn-27
/fightcenter/bouts/478326-ufc-on-espn-27-ion-the-hulk-cutelaba-vs-magomed-ankalaev,"\nKO/TKO, Punches\n",0:38 Round 1 of 3,\nMagomed Ankalaev\n\n\n,\n\n\nIon Cutelaba\n,205,\nMain Card\n,10,fightcenter/events/65142-ufc-on-espn-27
/fightcenter/bouts/474746-ufc-on-espn-27-megan-anderson-vs-norma-imortal-dumont,"\nKO/TKO, Right Cross\n",3:31 Round 1 of 3,\nMegan Anderson\n\n\n,\n\n\nNorma Dumont\n,145,\nMain Card\n,9,fightcenter/events/65142-ufc-on-espn-27
/fightcenter/bouts/489498-ufc-on-espn-27-grant-the-prophet-dawson-vs-darrick-minner,"\nSubmission, Rear Naked Choke\n","1:38 Round 2 of 3, 6:38 Total",\nGrant Dawson\n\n\n,\n\n\nDarrick Minner\n,145,\nMain Card\n,8,fightcenter/events/65142-ufc-on-espn-27
/fightcenter/bouts/475484-ufc-on-espn-27-gabriel-gabito-silva-vs-kyler-matrix-phillips,"\nDecision, Unanimous\n","3 Rounds, 15:00 Total",\nKyler Phillips\n\n\n,\n\n\nGabriel Silva\n,135,\nPrelim\n,7,fightcenter/events/65142-ufc-on-espn-27
/fightcenter/bouts/475483-ufc-on-espn-27-brendan-all-in-allen-vs-tom-breese,"\nKO/TKO, Ground and Pound\n",4:47 Round 1 of 3,\nBrendan Allen\n\n\n,\n\n\nTom Breese\n,185,\nPrelim\n,6,fightcenter/events/65142-ufc-on-espn-27
/fightcenter/bouts/478325-ufc-on-espn-27-marcin-tybur-tybura-vs-sergey-polar-bear-spivak,"\nDecision, Unanimous\n","3 Rounds, 15:00 Total",\nMarcin Tybura\n\n\n,\n\n\nSergey Spivak\n,265,\nPrelim\n,5,fightcenter/events/65142-ufc-on-espn-27
/fightcenter/bouts/490218-ufc-on-espn-27-violent-bob-ross-luis-pena-vs-steve-mean-machine-garcia-jr,"\nDecision, Unanimous\n","3 Rounds, 15:00 Total",\nLuis Pena\n\n\n,\n\n\nSteve Garcia\n,155,\nPrelim\n,4,fightcenter/events/65142-ufc-on-espn-27
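One way to avoid the duplicated first row, and the cost of concatenating inside the loop, is to collect one dict per bout and build the DataFrame once at the end. The sketch below reuses the class names found above; `text_of`, `parse_bout`, `parse_fight_card` and `SAMPLE_HTML` are hypothetical names, and the sample markup is a simplified stand-in for the real page.

```python
from bs4 import BeautifulSoup
import pandas as pd

def text_of(parent, class_name):
    """Stripped text of the first descendant with this class, or None if absent."""
    elem = parent.find(class_=class_name)
    return elem.get_text(strip=True) if elem is not None else None

def parse_bout(bout, event_url):
    """Turn one fightCardBout element into a flat dict of strings."""
    names = [n.get_text(strip=True)
             for n in bout.find_all(class_='fightCardFighterName')]
    billing = bout.find(class_='billing')
    anchor = billing.find('a') if billing is not None else None
    return {
        'method': text_of(bout, 'result'),
        'time': text_of(bout, 'time'),
        'fighter_0': names[0] if len(names) > 0 else None,
        'fighter_1': names[1] if len(names) > 1 else None,
        'weight': text_of(bout, 'weight'),
        'billing': text_of(bout, 'billing'),
        'bout_number': text_of(bout, 'fightCardBoutNumber'),
        'event_link': event_url,
        'link': anchor.get('href') if anchor is not None else None,
    }

def parse_fight_card(soup, event_url):
    """Build one DataFrame row per bout in the first fightCard element."""
    card = soup.find(class_='fightCard')
    bouts = card.find_all('div', class_='fightCardBout') if card is not None else []
    return pd.DataFrame([parse_bout(bout, event_url) for bout in bouts])

# Simplified, made-up markup for a single bout.
SAMPLE_HTML = """
<ul class="fightCard"><li><div class="fightCardBout">
  <span class="result">KO/TKO, Punches</span>
  <span class="time">0:38 Round 1 of 3</span>
  <div class="fightCardBoutNumber">10</div>
  <div class="fightCardFighterName left"><a href="#">Fighter A</a></div>
  <span class="billing"><a href="/fightcenter/bouts/example">Main Card</a></span>
  <span class="weight">205</span>
  <div class="fightCardFighterName right"><a href="#">Fighter B</a></div>
</div></li></ul>
"""

sample_df = parse_fight_card(BeautifulSoup(SAMPLE_HTML, 'html.parser'),
                             'fightcenter/events/example')
print(sample_df.set_index('link'))
```

Returning `None` for a missing element keeps one odd bout (for example a cancelled fight with no result) from raising an `AttributeError` and stopping the whole scrape.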


## Table: Events (addition)
I'm going to pause the fighter instances scrape and work on making the events scrape more thorough. First I want the events dataframe available. Then I'm going to open the first event.

In [35]:
df_events = pd.read_csv('previous_ufc.csv')
first_event = df_events.loc[0]

I'm going to open the page and check the first list with the event info:

In [36]:
event_soup = BeautifulSoup(src.open_tapology_link(first_event.link), 'html.parser')

## List parser
I realize I need a more reliable way to parse HTML lists for this project. Here I will test a function that turns an HTML list into a pandas dataframe, ready to be merged with my other dataframes.

### Find the list element

In [37]:
div = event_soup.find(class_="details details_with_poster clearfix")  # grab the top header of the event page
event_info_elem = div.find('ul')  # grab the first list in the top header

### Find the list items in the element

In [44]:
list_items = event_info_elem.find_all('li')
list_items = [item.get_text() for item in list_items]
list_items

['Saturday 05.30.2020 at 06:00 PM ET',
 '\nU.S. Broadcast:\nESPN\n\n',
 '\nName:\nUFC Fight Night: Woodley vs. Burns\n',
 '\nAlso Known As:\nUFC Fight Night APEX\n',
 '\nPromotion:\n\nUltimate Fighting Championship\n\n',
 '\nOwnership:\nEndeavor\n',
 '\nVenue:\nUFC APEX\n',
 '\nLocation:\nLas Vegas, Nevada, United States\n',
 '\nEnclosure:\nOctagon\n',
 '\nTV Announcers:\nBrendan Fitzgerald, Michael Bisping, Daniel Cormier\n',
 '\nRing Announcer:\nJoe Martinez\n',
 '\nPost-Fight Interviews:\nDaniel Cormier\n',
 '\nTV Ratings:\n1.02M avg. viewers (615k ESPN prelims)\n',
 '\nMMA Bouts:\n11\n',
 '\nPromotion Links:\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n',
 '\nEvent Links:\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n']

### Insert a 'header' label.
#### The info in the header may be useful later on

In [65]:
item_list = [item.split(':\n') for item in list_items]
item_list[0].insert(0, 'header')
item_list

[['header', 'Saturday 05.30.2020 at 06:00 PM ET'],
 ['\nU.S. Broadcast', 'ESPN\n\n'],
 ['\nName', 'UFC Fight Night: Woodley vs. Burns\n'],
 ['\nAlso Known As', 'UFC Fight Night APEX\n'],
 ['\nPromotion', '\nUltimate Fighting Championship\n\n'],
 ['\nOwnership', 'Endeavor\n'],
 ['\nVenue', 'UFC APEX\n'],
 ['\nLocation', 'Las Vegas, Nevada, United States\n'],
 ['\nEnclosure', 'Octagon\n'],
 ['\nTV Announcers', 'Brendan Fitzgerald, Michael Bisping, Daniel Cormier\n'],
 ['\nRing Announcer', 'Joe Martinez\n'],
 ['\nPost-Fight Interviews', 'Daniel Cormier\n'],
 ['\nTV Ratings', '1.02M avg. viewers (615k ESPN prelims)\n'],
 ['\nMMA Bouts', '11\n'],
 ['\nPromotion Links',
  '\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n'],
 ['\nEvent Links',
  '\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n']]

### Encapsulate the second element of each row in a list
#### This allows them to be made into a dataframe easily

In [69]:
item_list = [[item[0], [item[1].strip()]] for item in item_list]
item_list

[['header', ['Saturday 05.30.2020 at 06:00 PM ET']],
 ['\nU.S. Broadcast', ['ESPN']],
 ['\nName', ['UFC Fight Night: Woodley vs. Burns']],
 ['\nAlso Known As', ['UFC Fight Night APEX']],
 ['\nPromotion', ['Ultimate Fighting Championship']],
 ['\nOwnership', ['Endeavor']],
 ['\nVenue', ['UFC APEX']],
 ['\nLocation', ['Las Vegas, Nevada, United States']],
 ['\nEnclosure', ['Octagon']],
 ['\nTV Announcers', ['Brendan Fitzgerald, Michael Bisping, Daniel Cormier']],
 ['\nRing Announcer', ['Joe Martinez']],
 ['\nPost-Fight Interviews', ['Daniel Cormier']],
 ['\nTV Ratings', ['1.02M avg. viewers (615k ESPN prelims)']],
 ['\nMMA Bouts', ['11']],
 ['\nPromotion Links', ['']],
 ['\nEvent Links', ['']]]

### Convert to dataframe

In [70]:
list_df = pd.DataFrame(dict(item_list))
list_df

Unnamed: 0,header,\nU.S. Broadcast,\nName,\nAlso Known As,\nPromotion,\nOwnership,\nVenue,\nLocation,\nEnclosure,\nTV Announcers,\nRing Announcer,\nPost-Fight Interviews,\nTV Ratings,\nMMA Bouts,\nPromotion Links,\nEvent Links
0,Saturday 05.30.2020 at 06:00 PM ET,ESPN,UFC Fight Night: Woodley vs. Burns,UFC Fight Night APEX,Ultimate Fighting Championship,Endeavor,UFC APEX,"Las Vegas, Nevada, United States",Octagon,"Brendan Fitzgerald, Michael Bisping, Daniel Co...",Joe Martinez,Daniel Cormier,1.02M avg. viewers (615k ESPN prelims),11,,
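The column names above still carry their leading `\n`. As a hedged sketch, the three steps (split, label the header, strip) can be folded into one function that strips both the label and the value. `parse_list_items` is a hypothetical name, and the input is a shortened copy of the list items shown earlier.

```python
def parse_list_items(items, header_label='header'):
    """Map each 'Label:\\nvalue' string to a clean {label: value} entry.

    Items without a 'label:\\n' prefix (such as the date line) are stored
    under header_label; if there are several, the last one wins.
    """
    record = {}
    for text in items:
        label, sep, value = text.partition(':\n')
        if sep:
            record[label.strip()] = value.strip()
        else:
            record[header_label] = text.strip()
    return record

sample_items = [
    'Saturday 05.30.2020 at 06:00 PM ET',
    '\nU.S. Broadcast:\nESPN\n\n',
    '\nName:\nUFC Fight Night: Woodley vs. Burns\n',
    '\nPromotion Links:\n\n\n',
]
record = parse_list_items(sample_items)
print(record)
```

Passing `[record]` to `pd.DataFrame` then gives a one-row dataframe without wrapping every value in its own list.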


{0: 0                      header
 1            \nU.S. Broadcast
 2                      \nName
 3             \nAlso Known As
 4                 \nPromotion
 5                 \nOwnership
 6                     \nVenue
 7                  \nLocation
 8                 \nEnclosure
 9             \nTV Announcers
 10           \nRing Announcer
 11    \nPost-Fight Interviews
 12               \nTV Ratings
 13                \nMMA Bouts
 14          \nPromotion Links
 15              \nEvent Links
 Name: 0, dtype: object,
 1: 0                    Saturday 05.30.2020 at 06:00 PM ET
 1                                              ESPN\n\n
 2                  UFC Fight Night: Woodley vs. Burns\n
 3                                UFC Fight Night APEX\n
 4                  \nUltimate Fighting Championship\n\n
 5                                            Endeavor\n
 6                                            UFC APEX\n
 7                    Las Vegas, Nevada, United States\n
 8               

In [37]:
# turn the list element into text and split it on triple newlines
info_list = event_info_elem.get_text().split('\n\n\n')


# fields with no info lack a colon followed by a newline, so keep only the chunks that have one
new_list = []
for item in info_list:
    if ':\n' in item:
        new_list.append(item)

info_list = '\n'.join(new_list).split('\n')
info_list

['',
 'Saturday 05.30.2020 at 06:00 PM ET',
 '',
 'U.S. Broadcast:',
 'ESPN',
 '',
 'Name:',
 'UFC Fight Night: Woodley vs. Burns',
 'Also Known As:',
 'UFC Fight Night APEX',
 'Promotion:',
 '',
 'Ultimate Fighting Championship',
 '',
 'Ownership:',
 'Endeavor',
 'Venue:',
 'UFC APEX',
 'Location:',
 'Las Vegas, Nevada, United States',
 'Enclosure:',
 'Octagon',
 'TV Announcers:',
 'Brendan Fitzgerald, Michael Bisping, Daniel Cormier',
 'Ring Announcer:',
 'Joe Martinez',
 'Post-Fight Interviews:',
 'Daniel Cormier',
 'TV Ratings:',
 '1.02M avg. viewers (615k ESPN prelims)',
 'MMA Bouts:',
 '11']

Now I'm going to remove the empty strings and pop the start time off the front of the list.

In [38]:
info_list = list(filter(lambda item: item != '', info_list))
info_list

start_time = info_list.pop(0)
info_list

['U.S. Broadcast:',
 'ESPN',
 'Name:',
 'UFC Fight Night: Woodley vs. Burns',
 'Also Known As:',
 'UFC Fight Night APEX',
 'Promotion:',
 'Ultimate Fighting Championship',
 'Ownership:',
 'Endeavor',
 'Venue:',
 'UFC APEX',
 'Location:',
 'Las Vegas, Nevada, United States',
 'Enclosure:',
 'Octagon',
 'TV Announcers:',
 'Brendan Fitzgerald, Michael Bisping, Daniel Cormier',
 'Ring Announcer:',
 'Joe Martinez',
 'Post-Fight Interviews:',
 'Daniel Cormier',
 'TV Ratings:',
 '1.02M avg. viewers (615k ESPN prelims)',
 'MMA Bouts:',
 '11']

Now I'm going to group the items into (label, value) pairs so the list can be turned into a dictionary and then into a dataframe.

In [31]:
info_list = src.group(info_list, 2)
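`src.group` lives in my own package and isn't shown here. A minimal sketch of what a helper like it might do, assuming it splits a flat list into consecutive fixed-size tuples, is below; `group_sketch` is a stand-in name, not the real implementation.

```python
def group_sketch(items, size):
    """Split items into consecutive tuples of length size.

    A trailing partial group is dropped, so an odd leftover label
    never ends up paired with nothing.
    """
    return [tuple(items[i:i + size])
            for i in range(0, len(items) - size + 1, size)]

pairs = group_sketch(['Venue:', 'UFC APEX', 'Enclosure:', 'Octagon'], 2)
print(pairs)
print(dict(pairs))
```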

Zip the grouped info list into a dictionary and turn it into a dataframe.

In [32]:
info_df = pd.DataFrame(dict(info_list), index=[0]) #zipped the group into a dictionary, needs index param to work

Which columns do I want?

In [33]:
info_df.columns

Index(['U.S. Broadcast:', 'Name:', 'Also Known As:', 'Promotion:',
       'Ownership:', 'Venue:', 'Location:', 'Enclosure:', 'TV Announcers:',
       'Ring Announcer:', 'Post-Fight Interviews:', 'TV Ratings:',
       'MMA Bouts:'],
      dtype='object')

In [34]:
relevent_df = info_df.loc[:, ['Location:', 'Venue:', 'Enclosure:']]  # keep only the relevant columns

Add the link

In [35]:
relevent_df['link'] = first_event.link

In [93]:
relevent_df

Unnamed: 0,location,venue,enclosure,start_time,link
0,"Las Vegas, Nevada, United States",UFC APEX,Octagon,\nSaturday 05.30.2020 at 06:00 PM ET\n\nU.S. B...,/fightcenter/events/69127-ufc-fight-night


Now I'll turn this into a function and see how I can join the result onto the events table.

In [92]:
relevent_df = src.get_missing_event_info(event_soup, first_event.link)

[['\nName', 'UFC Fight Night: Woodley vs. Burns'], ['Also Known As', 'UFC Fight Night APEX'], ['Promotion', '\nUltimate Fighting Championship'], ['\nOwnership', 'Endeavor'], ['Venue', 'UFC APEX'], ['Location', 'Las Vegas, Nevada, United States'], ['Enclosure', 'Octagon'], ['TV Announcers', 'Brendan Fitzgerald, Michael Bisping, Daniel Cormier'], ['Ring Announcer', 'Joe Martinez'], ['Post-Fight Interviews', 'Daniel Cormier'], ['TV Ratings', '1.02M avg. viewers (615k ESPN prelims)'], ['MMA Bouts', '11']]


In [38]:
previous_ufc.head()

Unnamed: 0,event,name,date,bouts,link
0,UFC Fight Night,Woodley vs. Burns,2020-05-30,11,/fightcenter/events/69127-ufc-fight-night
1,UFC Fight Night,Overeem vs. Harris,2020-05-16,11,/fightcenter/events/67412-ufc-on-espn-33
2,UFC Fight Night,Smith vs. Teixeira,2020-05-13,10,/fightcenter/events/69126-ufc-fight-night
3,UFC 249,Ferguson vs. Gaethje,2020-05-09,11,/fightcenter/events/66312-ufc-250
10,UFC on ESPN+ 28,Lee vs. Oliveira,2020-03-14,12,/fightcenter/events/64600-ufc-on-espn-26


In [39]:
previous_ufc.join(relevent_df.set_index('link'), on='link')

Unnamed: 0,event,name,date,bouts,link,location,venue,enclosure,start_time
0,UFC Fight Night,Woodley vs. Burns,2020-05-30,11,/fightcenter/events/69127-ufc-fight-night,"Las Vegas, Nevada, United States",UFC APEX,Octagon,Saturday 05.30.2020 at 06:00 PM ET
1,UFC Fight Night,Overeem vs. Harris,2020-05-16,11,/fightcenter/events/67412-ufc-on-espn-33,,,,
2,UFC Fight Night,Smith vs. Teixeira,2020-05-13,10,/fightcenter/events/69126-ufc-fight-night,,,,
3,UFC 249,Ferguson vs. Gaethje,2020-05-09,11,/fightcenter/events/66312-ufc-250,,,,
10,UFC on ESPN+ 28,Lee vs. Oliveira,2020-03-14,12,/fightcenter/events/64600-ufc-on-espn-26,,,,
...,...,...,...,...,...,...,...,...,...
509,UFC 5,Return of the Beast,1995-04-07,10,/fightcenter/events/ufc-5-return-of-the-beast,,,,
510,UFC 4,Revenge of the Warriors,1994-12-16,10,/fightcenter/events/ufc-4-revenge-of-the-warriors,,,,
511,UFC 3,The American Dream,1994-09-09,6,/fightcenter/events/ufc-3-the-american-dream,,,,
512,UFC 2,No Way Out,1994-03-11,15,/fightcenter/events/ufc-2-no-way-out,,,,
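The join leaves `NaN` for every event that hasn't been scraped yet, which is expected at this stage. As a small sketch with made-up two-row frames, `merge` with `how='left'` gives the same result, and `validate='many_to_one'` raises an error if a link ever shows up twice in the scraped info. The names `events`, `extra` and `merged` are illustrative only.

```python
import pandas as pd

# Two events, shaped like the events table above (values copied from its first rows).
events = pd.DataFrame({
    'name': ['Woodley vs. Burns', 'Overeem vs. Harris'],
    'link': ['/fightcenter/events/69127-ufc-fight-night',
             '/fightcenter/events/67412-ufc-on-espn-33'],
})

# Extra info scraped for only the first event so far.
extra = pd.DataFrame({
    'link': ['/fightcenter/events/69127-ufc-fight-night'],
    'venue': ['UFC APEX'],
    'enclosure': ['Octagon'],
})

# Left merge keeps every event; unscraped events get NaN in the new columns.
merged = events.merge(extra, on='link', how='left', validate='many_to_one')
print(merged)
```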
