# Technical and Page Speed Audit

### Initial Setup 
- This script runs Screaming Frog crawls via the command line
- You need a paid subscription to Screaming Frog and a PageSpeed Insights (PSI) API key before proceeding
    - API Key Link - https://developers.google.com/speed/docs/insights/v5/get-started
- Read more about running Screaming Frog on the command line here:
    - https://www.screamingfrog.co.uk/seo-spider/user-guide/general/#command-line
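
As a rough sketch of what a command-line crawl looks like from Python, the call can be assembled as an argument list and run with `subprocess.run`, which reports a failed crawl through the exit code. The install path is the Windows default used later in this notebook; `build_crawl_command` and `run_crawl` are hypothetical helper names, and only flags that already appear in this notebook are used.

```python
import subprocess
from pathlib import Path

# Default Windows install location used by the script in this notebook (adjust for your machine).
SF_DIR = Path(r"C:\Program Files (x86)\Screaming Frog SEO Spider")
SF_CLI = SF_DIR / "ScreamingFrogSEOSpiderCli.exe"


def build_crawl_command(website, output_folder, export_tabs=("Internal:All",)):
    """Return the CLI call as a list of arguments (hypothetical helper)."""
    return [
        str(SF_CLI),
        "--crawl", website,
        "--headless",
        "--output-folder", str(output_folder),
        "--export-format", "csv",
        "--export-tabs", ",".join(export_tabs),
    ]


def run_crawl(website, output_folder):
    """Run the crawl and raise if the CLI is missing or exits with an error."""
    if not SF_CLI.exists():
        raise FileNotFoundError(f"Screaming Frog CLI not found at {SF_CLI}")
    # check=True raises CalledProcessError on a non-zero exit code
    subprocess.run(build_crawl_command(website, output_folder), cwd=SF_DIR, check=True)


cmd = build_crawl_command("https://www.yourwebsite.com/", r"c:\users\your_output_folder")
print(cmd[1:])
```

Passing a list instead of one long string means paths with spaces do not need manual quoting.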

### What does the code do?

- The following code runs Screaming Frog crawls via the command line. Our class has 2 methods: the first runs a standard technical SEO audit, and the second runs a page speed audit using the PageSpeed Insights (PSI) API.
- The code also saves an Excel audit file to your specified output folder.
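
The "one sheet per issue" idea can be sketched as follows, assuming each report is something that supports `len()` (a pandas DataFrame does). Excel limits sheet names to 31 characters and rejects a few characters, so the sketch also cleans the names. `safe_sheet_name` and `non_empty_sheets` are hypothetical helpers, not part of the class below.

```python
import re

# Excel limits sheet names to 31 characters and forbids these characters.
_FORBIDDEN = re.compile(r"[\[\]:*?/\\]")


def safe_sheet_name(name):
    """Return a sheet name that Excel will accept (hypothetical helper)."""
    return _FORBIDDEN.sub("_", name).strip()[:31]


def non_empty_sheets(reports):
    """Keep only reports with at least one row; works for anything that supports len()."""
    return {safe_sheet_name(name): data for name, data in reports.items() if len(data) > 0}


# Lists stand in for DataFrames here so the sketch runs without a crawl.
reports = {
    "Non_Indexable": [{"Address": "https://www.yourwebsite.com/a"}],
    "Redirect_chains": [],
    "Image Elements Do Not Have Explicit Width & Height": [{"Address": "https://www.yourwebsite.com/b"}],
}
sheets = non_empty_sheets(reports)
print(list(sheets))
```

With real DataFrames, each remaining entry would then be written with `data.to_excel(writer, sheet_name=name, index=False)`.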

### How to prioritize my efforts?
- As an SEO professional, it is up to you to interpret the data and prioritize your efforts accordingly. It is rarely practical to fix every single technical item, and every site is different, so you need to decide which items warrant a fix.
- On some sites, canonicals may be off on key pages; on others, the robots.txt file may be blocking the crawl of key subfolders.
- If you have key money pages that are really slow and have a terrible UX, improving their page speed may be the priority.
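
To illustrate the robots.txt case, here is a minimal sketch that uses Python's standard `urllib.robotparser` to test whether key pages are blocked. The rules and the `key_pages` list are made up for the example; in practice you would load your site's real robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Example rules; in practice you would load your site's real robots.txt.
robots_txt = """
User-agent: *
Disallow: /checkout/
Disallow: /products/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Hypothetical key pages to verify
key_pages = [
    "https://www.yourwebsite.com/products/best-seller",
    "https://www.yourwebsite.com/blog/guide",
]

# Pages that a generic crawler is not allowed to fetch
blocked = [url for url in key_pages if not parser.can_fetch("*", url)]
print(blocked)
```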

### Technical SEO Audit
- The code below runs Screaming Frog crawls via the command line / prompt, then saves all the technical issues to an Excel file
- Here are the issues we check for:
    - Non-Indexable URLs (200s)
    - Non-Self Referencing Canonicals 
    - Pages with Multiple H-1s
    - Pages with Missing Headings (H-1s and H-2s)
    - Structured Data Errors  
    - Redirect Chains
    - JavaScript Redirects
    - URLs that Canonicalize to Non-Indexable Pages
    - Meta Refreshes
    - All Non-200 Status Code Outlinks (For any 301 redirects, we ran an additional crawl to get the final destination URLs) 
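
The first two checks are boolean filters on the crawl export. A small sketch on made-up data shows how they can be written; the column names follow the Screaming Frog export used in this notebook. In Python, `|` and `&` bind more tightly than `==`, so every comparison needs its own parentheses (or its own variable, as here).

```python
import pandas as pd

# Toy stand-in for the Internal:All export (values are made up for illustration)
df = pd.DataFrame({
    "Address": ["https://www.yourwebsite.com/a", "https://www.yourwebsite.com/b", "https://www.yourwebsite.com/c"],
    "Canonical Link Element 1": [None, "https://www.yourwebsite.com/b", "https://www.yourwebsite.com/a"],
    "Indexability": ["Non-Indexable", "Indexable", "Non-Indexable"],
    "Status Code": [200, 200, 200],
})

no_canonical = df["Canonical Link Element 1"].isna()
self_canonical = df["Canonical Link Element 1"] == df["Address"]

# Non-indexable 200s that are not canonicalized to another URL
non_indexable = df[(no_canonical | self_canonical)
                   & (df["Indexability"] == "Non-Indexable")
                   & (df["Status Code"] == 200)]

# 200 pages whose canonical exists and points at a different URL
non_self_ref = df[df["Canonical Link Element 1"].notna()
                  & ~self_canonical
                  & (df["Status Code"] == 200)]

print(non_indexable["Address"].tolist())
print(non_self_ref["Address"].tolist())
```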
    
    
### Page Speed Audit
- The code below runs a page speed audit through Screaming Frog (using your PSI API key) via the command line / prompt, then saves all the page speed issues to an Excel file
- Here are the issues we check for:
    - Page Speed Metrics at the URL Level (Including CWV)
    - Render-blocking resources
    - Assets that need a longer cache policy
    - Images that aren't properly sized
    - Offscreen Images that need to be deferred
    - Images not in Next-Gen Format
    - Unminified JS and CSS
    - Unused JS and CSS
    - JS with Long Execution Time
    - Elements causing large layout shifts
    - Image elements that don't have explicit Width and Height
    - Excessive DOM Size
- Screaming Frog also has a summary tab that shows where we can improve
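
As a quick illustration of reading the URL-level metrics, the sketch below flags rows that miss Google's published "good" thresholds for two Core Web Vitals (LCP of 2.5 seconds or less, CLS of 0.1 or less). The rows are made up and `failing_vitals` is a hypothetical helper; the keys match the column names selected in the page speed method. It is a rough check on the exported values, not a substitute for the `Core Web Vitals Assessment` column.

```python
# "Good" thresholds published by Google for two Core Web Vitals
LCP_GOOD_MS = 2500  # Largest Contentful Paint, in milliseconds
CLS_GOOD = 0.1      # Cumulative Layout Shift, unitless


def failing_vitals(row):
    """Return the names of the metrics in one export row that miss the 'good' threshold."""
    failing = []
    if row["Largest Contentful Paint Time (ms)"] > LCP_GOOD_MS:
        failing.append("LCP")
    if row["Cumulative Layout Shift"] > CLS_GOOD:
        failing.append("CLS")
    return failing


# Made-up rows standing in for internal_html.csv
rows = [
    {"Address": "https://www.yourwebsite.com/",
     "Largest Contentful Paint Time (ms)": 1800, "Cumulative Layout Shift": 0.02},
    {"Address": "https://www.yourwebsite.com/slow",
     "Largest Contentful Paint Time (ms)": 4200, "Cumulative Layout Shift": 0.31},
]
for row in rows:
    print(row["Address"], failing_vitals(row))
```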

In [None]:
import pandas as pd
import numpy as np
import os 

class sf_crawl_audit:

    def __init__(self, website, output_folder):

        self.website = website
        self.output_folder = output_folder

    def technical(self):

        # Use the values passed to the constructor rather than notebook globals
        website = self.website
        output_folder = self.output_folder

        ## Run a Screaming Frog crawl on the command line / prompt to export the technical reports.
        ## Adjacent string literals keep stray whitespace out of the quoted export lists.
        command = (
            'cd "C:\\Program Files (x86)\\Screaming Frog SEO Spider" && ScreamingFrogSEOSpiderCli.exe '
            '--crawl {} --headless --output-folder "{}" --export-format "csv" '
            '--bulk-export "Response Codes:Redirection (JavaScript) Inlinks,Links:All Outlinks,'
            'Directives:Nofollow Inlinks,Response Codes:Redirection (Meta Refresh) Inlinks" '
            '--export-tabs "Internal:All,Structured Data:All,Canonicals:All" '
            '--save-report "Redirects:Redirect Chains" '
            '&& cd "{}" && rename internal_all.csv internal_all_.csv')

        # Three placeholders: website, output folder for the crawl, output folder for the rename
        sf_crawl = os.system(command.format(website, output_folder, output_folder))

        df = pd.read_csv(os.path.join(output_folder, 'internal_all_.csv'))

        ### Collect the canonical link elements of all 200 status code pages so they can be crawled separately
        canonicals = df[(df['Canonical Link Element 1'].notna()) & (df['Status Code'] == 200)]

        # Write the list once, in write mode, so a re-run does not append to a stale file
        canonical_list = os.path.join(output_folder, 'canonical_status_code_check.txt')
        with open(canonical_list, "w", encoding='utf-8') as output:
            for y in canonicals['Canonical Link Element 1'].unique():
                output.write(y + '\n')

        # Crawl the canonical URLs in list mode to get their status codes and indexability
        command = (
            'cd "C:\\Program Files (x86)\\Screaming Frog SEO Spider" && ScreamingFrogSEOSpiderCli.exe '
            '--crawl-list "{}" --headless --output-folder "{}" --export-format "csv" --export-tabs '
            '"Internal:All" && cd "{}" && rename internal_all.csv canonical_status_code.csv')
        crawl_canonicals = os.system(command.format(canonical_list, output_folder, output_folder))
        
        
        
        # Save the outlink destination URLs that return a 301 to a text file

        outlinks = pd.read_csv(os.path.join(output_folder, 'all_outlinks.csv'))

        # Create a list of all the 301 redirects and run a crawl on them (this is to find the final destination URL)
        final_destination_urls = os.path.join(output_folder, 'outlinks_status_code.txt')
        with open(final_destination_urls, "w", encoding='utf-8') as output:
            for y in outlinks[outlinks['Status Code'] == 301]['Destination'].unique().astype(str):
                output.write(y + '\n')

        command = (
            'cd "C:\\Program Files (x86)\\Screaming Frog SEO Spider" && ScreamingFrogSEOSpiderCli.exe '
            '--crawl-list "{}" --headless --output-folder "{}" --export-format "csv" --export-tabs '
            '"Internal:All" && cd "{}" && rename internal_all.csv final_destination_urls.csv')
        crawl_301s = os.system(command.format(final_destination_urls, output_folder, output_folder))
         

        ### Create DataFrame listing out Non-Indexable URLs (200s with no canonical or a self-referencing one)
        ### Each comparison is parenthesized because | and & bind more tightly than ==
        non_indexable_urls = df[((df['Canonical Link Element 1'].isna()) | (df['Canonical Link Element 1'] == df['Address'])) & (df['Indexability'] == 'Non-Indexable') & (df['Status Code'] == 200)]


        #### Create DataFrame for all 200 status code pages with non-self referencing Canonicals
        #### (pages without any canonical are excluded, since NaN != Address is always True)
        non_self_referencing_canonicals = df[(df['Canonical Link Element 1'].notna()) & (df['Canonical Link Element 1'] != df['Address']) & (df['Status Code'] == 200)]


        ### Create DataFrame listing out pages with Multiple H1s
        multiple_h1s = df[df['H1-2'].notna()]

        ### Create DataFrame listing out Missing Headings (no H1 or no H2)
        missing_headings = df[(df['H1-1'].isna())  | (df['H2-1'].isna())]

        ### Create DataFrame listing out  Structured Data for Errors
        structured_data = pd.read_csv(output_folder + '\\structured_data_all.csv')
        structured_data_errors = structured_data[structured_data['Errors'] > 0]


        ### Redirect Chains 
        redirect_chains  = pd.read_csv(output_folder + '\\redirect_chains.csv')


        ### JavaScript Redirects 
        js_redirects  = pd.read_csv(output_folder + '\\redirection_(javascript)_inlinks.csv')


        ### Meta Refreshes
        meta_refreshes  = pd.read_csv(output_folder + '\\redirection_(meta_refresh)_inlinks.csv')


        ### Canonical Status Codes

        canonical_status_codes = pd.read_csv(output_folder + '\\canonical_status_code.csv')


        ### Find all non-indexable canonical links, then join them with the pages that canonicalize to them

        non_indexable_canonicals = canonical_status_codes[(canonical_status_codes['Status Code'] != 200) | (canonical_status_codes['Indexability'] != 'Indexable') ]
        if len(non_indexable_canonicals) > 0:
            non_indexable_canonicals = non_indexable_canonicals[['Address','Status Code','Indexability']]
            # rename() returns a new DataFrame, so the result has to be assigned back
            non_indexable_canonicals = non_indexable_canonicals.rename(columns =
                                    {'Address':'Canonical Link Element 1','Status Code':'Canonical_Link_Status_Code',
                                    'Indexability':'Canonical_Indexability'})

            non_indexable_canonicals = non_indexable_canonicals.merge(df[['Address','Canonical Link Element 1']], how = 'left', on = 'Canonical Link Element 1')

            ## create action column to call out the non-indexable canonical link
            non_indexable_canonicals['Action'] = 'Canonical Link is Non-Indexable'

        outlinks = pd.read_csv(os.path.join(output_folder, 'all_outlinks.csv'))

        ### Merge Final Destination URLs

        final_destination_urls = pd.read_csv(os.path.join(output_folder, 'final_destination_urls.csv'))

        final_destination_urls = final_destination_urls.rename(columns = {'Address':'Destination','Redirect URL':'Final Destination URL'})

        # merge() has no inplace option; keep only the two columns needed so outlink columns are not duplicated
        outlinks = outlinks.merge(final_destination_urls[['Destination','Final Destination URL']], how = 'left', on = 'Destination')

        # Keep only the outlinks that do not return a 200 status code
        outlinks = outlinks[outlinks['Status Code'] != 200]

        ### Save all Technical Issues to an Excel file, one sheet per non-empty report
        sheets = {
            'Non_Indexable': non_indexable_urls,
            'Non_Self_Ref': non_self_referencing_canonicals,
            'Multiple_H1s': multiple_h1s,
            'Missing_Hs': missing_headings,
            'Schema_Errors': structured_data_errors,
            'Redirect_chains': redirect_chains,
            'JS_Redirects': js_redirects,
            'Meta_Refresh': meta_refreshes,
            'Non_Index_Canonicals': non_indexable_canonicals,
            'Non_200_Pages': outlinks,
        }

        with pd.ExcelWriter(os.path.join(output_folder, 'Technical SEO Audit.xlsx')) as writer:
            df.to_excel(writer, sheet_name = 'All URLs', index = False)
            for sheet_name, data in sheets.items():
                # Each report is checked on its own (an if/elif chain would write only the first non-empty one)
                if len(data) > 0:
                    data.to_excel(writer, sheet_name = sheet_name, index = False)


        print('Technical Crawl Complete')
        
    
    def page_speed(self):

        # Use the values passed to the constructor rather than notebook globals
        website = self.website
        output_folder = self.output_folder

        # PageSpeed reports to save; joined with commas so no stray whitespace ends up in the report names
        reports = ','.join([
            'PageSpeed:Minify JavaScript', 'PageSpeed:JavaScript Coverage Summary',
            'PageSpeed:Eliminate Render-Blocking Resources', 'PageSpeed:PageSpeed Opportunities Summary',
            'PageSpeed:CSS Coverage Summary', 'PageSpeed:Serve Static Assets with an Efficient Cache Policy',
            'PageSpeed:Properly Size Images', 'PageSpeed:Defer Offscreen Images',
            'PageSpeed:Reduce Unused JavaScript', 'PageSpeed:Efficiently Encode Images',
            'PageSpeed:Serve Images in Next-Gen Formats', 'PageSpeed:Enable Text Compression',
            'PageSpeed:Minify CSS', 'PageSpeed:Reduce Unused CSS',
            'PageSpeed:Avoid Excessive DOM Size', 'PageSpeed:Reduce JavaScript Execution Time',
            'PageSpeed:Preload Key Requests', 'PageSpeed:Use Video Formats for Animated Content',
            'PageSpeed:Minimize Main-Thread Work', 'PageSpeed:Ensure Text Remains Visible During Webfont Load',
            'PageSpeed:Avoid Large Layout Shifts',
            'PageSpeed:Image Elements Do Not Have Explicit Width & Height'])

        command = (
            'cd "C:\\Program Files (x86)\\Screaming Frog SEO Spider" && ScreamingFrogSEOSpiderCli.exe '
            '--crawl {} --headless --output-folder "{}" --export-format "csv" --use-pagespeed '
            '--export-tabs "Internal:HTML" --save-report "{}"')
        ps_command = os.system(command.format(website, output_folder, reports))

        df = pd.read_csv(os.path.join(output_folder, 'internal_html.csv'))

        
        df = df[['Address','Title 1', 'Indexability', 'Performance Score', 'First Contentful Paint Time (ms)',
                    'Speed Index Time (ms)', 'Largest Contentful Paint Time (ms)',
                    'Time to Interactive (ms)', 'Total Blocking Time (ms)',
                    'Cumulative Layout Shift', 'Total Size Savings (Bytes)',
                    'Total Time Savings (ms)', 'Total Requests', 'Total Page Size (Bytes)',
                    'HTML Size (Bytes)', 'HTML Count', 'Image Size (Bytes)', 'Image Count',
                    'CSS Size (Bytes)', 'CSS Count', 'JavaScript Size (Bytes)',
                    'JavaScript Count', 'Font Count','Font Size (Bytes)', 
                    'Media Size (Bytes)', 'Media Count', 'Other Size (Bytes)',
                    'Other Count',  'Third Party Count','Third Party Size (Bytes)',
                    'Core Web Vitals Assessment','Eliminate Render-Blocking Resources Savings (ms)',
                    'Efficiently Encode Images Savings (ms)',
                    'Defer Offscreen Images Savings (ms)',
                    'Properly Size Images Savings (ms)',
                    'Minify JavaScript Savings (ms)',
                    'Minify CSS Savings (ms)',
                    'Reduce Unused CSS Savings (ms)',
                    'Reduce Unused JavaScript Savings (ms)',
                    'Serve Images in Next-Gen Formats Savings (ms)',
                    'Preconnect to Required Origins Savings (ms)',
                    'Enable Text Compression Savings (ms)',
                    'Server Response Times (TTFB) (ms)', 'Multiple Redirects Savings (ms)',
                    'Preload Key Requests Savings (ms)',
                    'Use Video Format for Animated Images Savings (ms)',
                    'Total Image Optimization Savings (ms)',
                    'Crawl Timestamp']]

        # Combine the report CSVs into 1 single Excel file: sheet name -> exported CSV file
        report_files = {
            'Page_Speed Opportunities': 'pagespeed_opportunities_summary.csv',
            'CSS Coverage': 'css_coverage_summary.csv',
            'JS Coverage': 'js_coverage_summary.csv',
            'Excessive DOM Size': 'avoid_excessive_dom_size_report.csv',
            'Text Remains Visible': 'ensure_text_remains_visible_during_webfont_load_report.csv',
            'Text Compression': 'enable_text_compression_report.csv',
            'Efficient Cache Policy': 'serve_static_assets_with_an_efficient_cache_policy_report.csv',
            'Render Blocking Res.': 'eliminate_render_blocking_resources_report.csv',
            'Minify CSS': 'minify_css_report.csv',
            'Minify JS': 'minify_javascript_report.csv',
            'Reduce JS Execution': 'reduce_javascript_execution_time_report.csv',
            'Reduce Unused CSS': 'reduce_unused_css_report.csv',
            'Reduce Unused JS': 'reduce_unused_javascript_report.csv',
            'Minimize_Main_Thread': 'minimize_main_thread_work_report.csv',
            'Use Video Format for Animation': 'use_video_formats_for_animated_content_report.csv',
            'Preload_Key_Reqs': 'preload_key_requests_report.csv',
            'Properly_Size_Images': 'properly_size_images_report.csv',
            'Encode_Images': 'efficiently_encode_images_report.csv',
            'Defer Offscreen Images': 'defer_offscreen_images_report.csv',
            'Next-Gen Image Formats': 'serve_images_in_next_gen_formats_report.csv',
            'Images Missing Width & Height': 'image_elements_do_not_have_explicit_width_&_height_report.csv',
            'Avoid Large Layout Shifts': 'avoid_large_layout_shifts_report.csv',
        }

        ## Save Page Speed issues to an Excel file, skipping empty reports
        with pd.ExcelWriter(os.path.join(output_folder, 'Page Speed Audit.xlsx')) as writer:
            # df holds the URL-level metrics selected above
            df.to_excel(writer, sheet_name = 'All URLs', index = False)
            for sheet_name, file_name in report_files.items():
                report = pd.read_csv(os.path.join(output_folder, file_name))
                if len(report) > 0:
                    report.to_excel(writer, sheet_name = sheet_name, index = False)

        

### Technical SEO Audit
- The cell below runs the Screaming Frog crawls via the command line / prompt, then saves all the technical issues in an Excel file

In [None]:
output_folder = r'c:\users\your_output_folder'
website = 'https://www.yourwebsite.com/'
sf_crawl = sf_crawl_audit(website, output_folder)


sf_crawl.technical()

#### Visualizing the Data
- In addition to saving all the data in an Excel file (which you can open afterwards), we can also view the data in this Jupyter notebook by running the rest of the cells below

In [None]:
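# technical() renames the main crawl export to internal_all_.csv, so read that file
df = pd.read_csv(os.path.join(output_folder, 'internal_all_.csv'))

### Create DataFrame listing out Non-Indexable URLs (200s with no canonical or a self-referencing one)
### Each comparison is parenthesized because | and & bind more tightly than ==
non_indexable_urls = df[((df['Canonical Link Element 1'].isna()) | (df['Canonical Link Element 1'] == df['Address'])) & (df['Indexability'] == 'Non-Indexable') & (df['Status Code'] == 200)]


#### Create DataFrame for all 200 status code pages with non-self referencing Canonicals
#### (pages without any canonical are excluded, since NaN != Address is always True)
non_self_referencing_canonicals = df[(df['Canonical Link Element 1'].notna()) & (df['Canonical Link Element 1'] != df['Address']) & (df['Status Code'] == 200)]


### Create DataFrame listing out pages with Multiple H1s
multiple_h1s = df[df['H1-2'].notna()]

### Create DataFrame listing out Missing Headings (no H1 or no H2)
missing_headings = df[(df['H1-1'].isna())  | (df['H2-1'].isna())]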

### Create DataFrame listing out  Structured Data for Errors
structured_data = pd.read_csv(output_folder + '\\structured_data_all.csv')
structured_data_errors = structured_data[structured_data['Errors'] > 0]


### Redirect Chains 
redirect_chains  = pd.read_csv(output_folder + '\\redirect_chains.csv')


### JavaScript Redirects 
js_redirects  = pd.read_csv(output_folder + '\\redirection_(javascript)_inlinks.csv')


### Meta Refreshes
meta_refreshes  = pd.read_csv(output_folder + '\\redirection_(meta_refresh)_inlinks.csv')


### Canonical Status Codes

canonical_status_codes = pd.read_csv(output_folder + '\\canonical_status_code.csv')


### Find all non-indexable canonical links, then join them with the pages that canonicalize to them

non_indexable_canonicals = canonical_status_codes[(canonical_status_codes['Status Code'] != 200) | (canonical_status_codes['Indexability'] != 'Indexable') ]
if len(non_indexable_canonicals) > 0:
    non_indexable_canonicals = non_indexable_canonicals[['Address','Status Code','Indexability']]
    # rename() returns a new DataFrame, so the result has to be assigned back
    non_indexable_canonicals = non_indexable_canonicals.rename(columns =
                            {'Address':'Canonical Link Element 1','Status Code':'Canonical_Link_Status_Code',
                            'Indexability':'Canonical_Indexability'})

    non_indexable_canonicals = non_indexable_canonicals.merge(df[['Address','Canonical Link Element 1']], how = 'left', on = 'Canonical Link Element 1')

    ## create action column to call out the non-indexable canonical link
    non_indexable_canonicals['Action'] = 'Canonical Link is Non-Indexable'

### 301 Redirects
outlinks = pd.read_csv(os.path.join(output_folder, 'all_outlinks.csv'))

final_destination_urls = pd.read_csv(os.path.join(output_folder, 'final_destination_urls.csv'))

final_destination_urls = final_destination_urls.rename(columns = {'Address':'Destination','Redirect URL':'Final Destination URL'})

# merge() has no inplace option; keep only the two columns needed so outlink columns are not duplicated
outlinks = outlinks.merge(final_destination_urls[['Destination','Final Destination URL']], how = 'left', on = 'Destination')

#### Non-Indexable URLs
- Here are all your non-indexable 200 status Code URLs (Not Canonicalizing to another URL)

In [None]:
non_indexable_urls.head(10)

#### Missing Heading Tags
- Here are pages with missing Headings Tags (H-1s and H-2s)

In [None]:
missing_headings.head(10)

#### URL has a non-self referencing Canonical Tag
- URL's status code is 200, but it contains a non-self referencing canonical tag

In [None]:
non_self_referencing_canonicals.head(10)

#### Multiple H-1 tags
- URLs have more than 1 H-1 tag

In [None]:
multiple_h1s.head(10)

#### Structured Data Issues 
- These URLs have Structured Data Errors 

In [None]:
structured_data_errors.head(10)

#### Redirect Chains 
- Here are all the redirect chains

In [None]:
redirect_chains.head(10)

#### JavaScript Redirects 
- Here are all the JS Redirects that should be cleaned up

In [None]:
js_redirects.head(10)

#### Meta Refreshes
- Here are all the Meta Refreshes

In [None]:
meta_refreshes.head(10)

#### Canonical Status Codes
- Here are all the canonicals that are not indexable
- We joined the non-indexable canonicals with the pages canonicalizing to them, so you can quickly see which pages need their canonicals updated!
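
The join can be sketched on toy data as follows. The values and the `pages` / `report` names are made up; the column names follow the export used above. Note that `rename()` returns a new DataFrame, so its result has to be assigned before merging on the renamed column.

```python
import pandas as pd

# Toy crawl of the canonical targets (made-up values)
canonical_status_codes = pd.DataFrame({
    "Address": ["https://www.yourwebsite.com/old", "https://www.yourwebsite.com/ok"],
    "Status Code": [404, 200],
    "Indexability": ["Non-Indexable", "Indexable"],
})

# Toy main crawl: which page points at which canonical
pages = pd.DataFrame({
    "Address": ["https://www.yourwebsite.com/p1", "https://www.yourwebsite.com/p2"],
    "Canonical Link Element 1": ["https://www.yourwebsite.com/old", "https://www.yourwebsite.com/ok"],
})

bad = canonical_status_codes[
    (canonical_status_codes["Status Code"] != 200)
    | (canonical_status_codes["Indexability"] != "Indexable")
]

# rename() returns a new DataFrame, so the result has to be assigned
bad = bad.rename(columns={
    "Address": "Canonical Link Element 1",
    "Status Code": "Canonical_Link_Status_Code",
    "Indexability": "Canonical_Indexability",
})

# Left join: one row per page that canonicalizes to a non-indexable target
report = bad.merge(pages, how="left", on="Canonical Link Element 1")
print(report[["Address", "Canonical Link Element 1", "Canonical_Link_Status_Code"]])
```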

In [None]:
non_indexable_canonicals.head(10)

#### Non 200 Status Code Outlinks
- Here are all the pages we link to that return a non-200 status code
- For any 301s, we also ran a crawl to return the "Final Destination URL", so you can quickly swap out the redirect link for the final destination URL. 
- Screaming Frog's Outlinks Export is great because it tells you where the link can be found (X-Path), the anchor used and the type of link (CSS, JS, Image, HTML, etc.)
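
To make "Final Destination URL" concrete, the sketch below follows a redirect map until it reaches a URL that no longer redirects. This is only an illustration of the idea: the notebook itself gets the destinations from an extra Screaming Frog crawl, and `final_destination` and the `redirects` dictionary are hypothetical.

```python
def final_destination(url, redirects, max_hops=10):
    """Follow a {source: target} redirect map until a URL no longer redirects."""
    seen = {url}
    for _ in range(max_hops):
        target = redirects.get(url)
        if target is None:
            return url  # no further redirect: this is the final destination
        if target in seen:
            raise ValueError(f"Redirect loop detected at {target}")
        seen.add(target)
        url = target
    raise ValueError(f"More than {max_hops} redirects in a row")


# Made-up redirect map: /a redirects to /b, which redirects to /c
redirects = {
    "https://www.yourwebsite.com/a": "https://www.yourwebsite.com/b",
    "https://www.yourwebsite.com/b": "https://www.yourwebsite.com/c",
}
print(final_destination("https://www.yourwebsite.com/a", redirects))
```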

In [None]:
outlinks[outlinks['Status Code'] != 200].head(10)

### Page Speed Audit
- The cell below runs the page speed audit through Screaming Frog (using your PSI API key) via the command line / prompt, then saves all the page speed issues in an Excel file

#### You can prioritize your efforts based on the URLs likely to drive the most revenue (unfortunately, I do not have that data)
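
If you do have revenue data, a minimal sketch of that prioritization could look like this. Both dictionaries are made up, `prioritize` is a hypothetical helper, and the cut-off of 49 reflects the Lighthouse scale on which a performance score of 0 to 49 is rated poor.

```python
def prioritize(performance, revenue, max_score=49):
    """Return slow URLs (performance score <= max_score), highest revenue first."""
    slow = [url for url, score in performance.items() if score <= max_score]
    return sorted(slow, key=lambda url: revenue.get(url, 0), reverse=True)


# Hypothetical inputs: performance scores from the crawl, revenue per URL from analytics
performance = {
    "https://www.yourwebsite.com/pricing": 35,
    "https://www.yourwebsite.com/blog/post": 42,
    "https://www.yourwebsite.com/": 91,
}
revenue = {
    "https://www.yourwebsite.com/pricing": 12000,
    "https://www.yourwebsite.com/blog/post": 300,
    "https://www.yourwebsite.com/": 50000,
}
print(prioritize(performance, revenue))
```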

In [None]:
sf_crawl.page_speed()

#### Visualizing the Data
- In addition to saving all the data in an Excel file (which you can open afterwards), we can also view the summary data below

#### CSS Summary 
- Here is the CSS Summary

In [None]:
css_summary = pd.read_csv(output_folder + '\\css_coverage_summary.csv')
css_summary

#### JavaScript Summary
- Here is the JS Summary 

In [None]:
js_coverage_summary = pd.read_csv(output_folder + '\\js_coverage_summary.csv')
js_coverage_summary