# What to do when your code is slow

1. Make sure it works first, then make it fast.
2. Use profilers to find out where the time actually goes:
    - https://julien.danjou.info/blog/2015/guide-to-python-profiling-cprofile-concrete-case-carbonara
3. Use multiple cores:
    - https://pymotw.com/2/multiprocessing/basics.html
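
As a minimal sketch of the profiling step, the standard-library `cProfile` and `pstats` modules can show which functions dominate the run time. The functions `slow_sum` and `work` below are made up for illustration and are not part of this notebook's data pipeline.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately wasteful: builds a full list only to sum it."""
    return sum([i * i for i in range(n)])


def work():
    """Hypothetical workload that calls the slow function repeatedly."""
    return [slow_sum(50_000) for _ in range(20)]


# Collect timing data only for the code between enable() and disable().
profiler = cProfile.Profile()
profiler.enable()
work()
profiler.disable()

# Sort by cumulative time and print the ten most expensive entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(10)
report = stream.getvalue()
print(report)
```

In the report, the rows with the largest cumulative time are the first candidates for optimisation or parallelisation.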

In [1]:
from multiprocessing import Pool
from time import sleep, time  # time.clock was removed in Python 3.8 and is not needed here

def slow_function(e):
    sleep(2)
    print("Finished",e)
    return e*10

# without
print('Normal loop')
elements = range(8)
time1 = time()
processed_elements = list(map(slow_function, elements))
time2 = time()
print('Took %0.3f s' % (time2-time1))
print(processed_elements)

# with
print('Fast loop')
pool = Pool(8)
elements = range(8)
time1 = time()
processed_elements = pool.map(slow_function, elements)
time2 = time()
print('Took %0.3f s' % (time2-time1))
pool.close()  # no more tasks will be submitted
pool.join()   # wait for the worker processes to exit
print(processed_elements)

Normal loop
Finished 0
Finished 1
Finished 2
Finished 3
Finished 4
Finished 5
Finished 6
Finished 7
Took 16.019 s
[0, 10, 20, 30, 40, 50, 60, 70]
Fast loop
Finished 0
Finished 4
Finished 1
Finished 6
Finished 2
Finished 5
Finished 3
Finished 7
Took 2.009 s
[0, 10, 20, 30, 40, 50, 60, 70]
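
The function above spends its time sleeping rather than computing, so a pool of threads should give a similar speed-up without starting extra processes. The sketch below assumes a shortened delay (0.2 s instead of 2 s) and uses `concurrent.futures.ThreadPoolExecutor` from the standard library; `wait_then_scale` is a stand-in name, not part of the notebook.

```python
from concurrent.futures import ThreadPoolExecutor
from time import perf_counter, sleep


def wait_then_scale(e, delay=0.2):
    """Stand-in for slow_function: waits, then returns e * 10."""
    sleep(delay)  # sleeping releases the GIL, so threads overlap here
    return e * 10


elements = range(8)
start = perf_counter()
# The context manager shuts the pool down once all tasks are finished.
with ThreadPoolExecutor(max_workers=8) as executor:
    results = list(executor.map(wait_then_scale, elements))
elapsed = perf_counter() - start

print('Took %0.3f s' % elapsed)
print(results)
```

Threads help when the work is mostly waiting (sleep, network, disk). For CPU-bound pure-Python code, processes such as `multiprocessing.Pool` are usually the better fit.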


In [2]:
import pandas as pd


In [3]:
wsDF = pd.read_csv('data/websites.csv')

In [28]:
sites = ['http://www.' + x for x in wsDF['url']]

In [31]:
user_agent = {'User-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.75 Safari/537.36'}

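import requests

def process(url):
    """Download `url` and return {url: content}; content is "" on failure."""
    try:
        result = requests.get(url, headers=user_agent, timeout=10).content
        print("Retrieved url : {}".format(url))
        return {url: result}
    except requests.RequestException:
        # Connection errors, timeouts, invalid URLs and similar failures
        print("Could not retrieve url : {}".format(url))
        return {url: ""}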

process(sites[0])

Retrieved url : http://www.printmanagement.com


{'http://www.printmanagement.com': b'<html><head><META HTTP-EQUIV="refresh" CONTENT="0;URL=/cgi-sys/defaultwebpage.cgi"></head><body></body></html>\n'}

In [32]:
import pickle
pool = Pool(8)
time1 = time()
processed_elements = pool.map(process, sites)
time2 = time()
print('Took %0.3f s' % (time2-time1))
pool.close()  # no more tasks will be submitted
pool.join()   # wait for the worker processes to exit
with open('data/sites.p', 'wb') as f:
    pickle.dump(processed_elements, f)

Retrieved url : http://www.printmanagement.com
Retrieved url : http://www.syvjournal.com
Could not retrieve url : http://www.achieveagency.com
Could not retrieve url : http://www.studeo.com
Retrieved url : http://www.netdriven.com
Retrieved url : http://www.briggscaldwell.com
Retrieved url : http://www.raincloudmedia.com
Retrieved url : http://www.actimediadigital.com
Retrieved url : http://www.arcticicearena.com
Could not retrieve url : http://www.media-corp.com
Could not retrieve url : http://www.meringcarson.com
Retrieved url : http://www.p11.com
Retrieved url : http://www.mccabe-duval.com
Retrieved url : http://www.monacorarecoins.com
Retrieved url : http://www.searchlogic.com
Retrieved url : http://www.victorysign.com
Retrieved url : http://www.earworks.com
Retrieved url : http://www.haasortho.com
Retrieved url : http://www.ianinc.net
Retrieved url : http://www.arbico-organics.com
Retrieved url : http://www.creativesignresources.com
Could not retrieve url : http://www.ensemblemedi

Retrieved url : http://www.worldwidebranding.com
Retrieved url : http://www.ecardsystems.com
Retrieved url : http://www.entermarketing.com
Retrieved url : http://www.pbpmedia.com
Retrieved url : http://www.ellcreative.com
Retrieved url : http://www.olsoncom.com
Could not retrieve url : http://www.searcylaw.com
Retrieved url : http://www.indianapolissignworks.com
Retrieved url : http://www.key-ads.com
Retrieved url : http://www.visionsink.com
Retrieved url : http://www.ulrichsign.com
Retrieved url : http://www.teligent.com
Retrieved url : http://www.besalesteams.com
Retrieved url : http://www.dailytimes.com
Retrieved url : http://www.businessevolutionsinc.com
Retrieved url : http://www.trmiller.com
Retrieved url : http://www.playhaven.com
Retrieved url : http://www.palmtreecreative.com
Retrieved url : http://www.hyphendigital.com
Retrieved url : http://www.rvue.com
Retrieved url : http://www.lwra.com
Retrieved url : http://www.mogoartsmarketing.com
Retrieved url : http://www.kmgconsulta

Retrieved url : http://www.excelsiormarketingsolutions.com
Retrieved url : http://www.directresponsegroup.com
Retrieved url : http://www.useallfive.com
Retrieved url : http://www.interactrv.com
Retrieved url : http://www.quell.com
Retrieved url : http://www.boxing-clever.com
Retrieved url : http://www.profitstreams.com
Retrieved url : http://www.usigns.com
Retrieved url : http://www.xbiz.com
Retrieved url : http://www.thehdg.biz
Retrieved url : http://www.omegabusinessconsulting.com
Retrieved url : http://www.wagstaffworldwide.com
Retrieved url : http://www.tmarketinggroup.net
Could not retrieve url : http://www.bridgevine.com
Retrieved url : http://www.dentagency.com
Retrieved url : http://www.mmgfulfillment.com
Retrieved url : http://www.herogrp.com
Retrieved url : http://www.simantel.com
Retrieved url : http://www.discovercg.com
Retrieved url : http://www.garrowmediallc.com
Retrieved url : http://www.forevercomm.com
Could not retrieve url : http://www.ozoneonline.com

Retrieved url : http://www.willowstagency.com
Retrieved url : http://www.searchwirellc.com
Retrieved url : http://www.tdfischer.com
Retrieved url : http://www.tomorrownetworks.com
Retrieved url : http://www.southincnashville.com
Retrieved url : http://www.arsenalnewyork.com
Retrieved url : http://www.accomplice.io
Retrieved url : http://www.imaginalmarketing.com
Retrieved url : http://www.ettsi.com
Retrieved url : http://www.delcosales.com
Retrieved url : http://www.holdcom.com
Retrieved url : http://www.testrite.com
Retrieved url : http://www.blueforeststudios.com
Retrieved url : http://www.spotlitemarketing.com
Retrieved url : http://www.creativemarketingplus.com
Retrieved url : http://www.seminternational.com
Retrieved url : http://www.blueberries.com
Retrieved url : http://www.cnpsigns.com
Retrieved url : http://www.nyslpromotions.com
Retrieved url : http://www.farringtonhighschool.org
Retrieved url : http://www.thehannongroup.com
Retrieved url : http://www.aberdeennews.com

Retrieved url : http://www.jetsetstudios.com
Could not retrieve url : http://www.trustworkz.com
Retrieved url : http://www.revelryagency.com
Retrieved url : http://www.twhcollectibles.com
Could not retrieve url : http://www.webstop.com
Retrieved url : http://www.asv1.com
Could not retrieve url : http://www.silverbacknetwork.com
Retrieved url : http://www.emarketingconcepts.com
Retrieved url : http://www.bellsigns.com
Retrieved url : http://www.aspenlabs.com
Retrieved url : http://www.emediamd.com
Retrieved url : http://www.reachpros.com
Retrieved url : http://www.midwestsignco.com
Retrieved url : http://www.premierlegalmarketing.com
Retrieved url : http://www.lookingo.com
Retrieved url : http://www.summits-online.com
Retrieved url : http://www.nexusenterprisesolutions.com
Retrieved url : http://www.tidbit.mx
Retrieved url : http://www.cultivatorads.com
Retrieved url : http://www.tebww.com
Retrieved url : http://www.capzool.com
Retrieved url : http://www.interlexusa.com

Retrieved url : http://www.tnblive.com
Retrieved url : http://www.bamko.net
Retrieved url : http://www.afligo.com
Retrieved url : http://www.olprint.nl
Retrieved url : http://www.perronegrp.com
Retrieved url : http://www.adantelife.com
Retrieved url : http://www.moneypages.com
Retrieved url : http://www.knapsack.com
Retrieved url : http://www.holisticwebpresence.com
Retrieved url : http://www.glimmerchicago.com
Retrieved url : http://www.crst.net
Retrieved url : http://www.finderskeeperscard.com
Retrieved url : http://www.fusionllc.net
Retrieved url : http://www.retailmarketing.com.mx
Retrieved url : http://www.kennickell.com
Retrieved url : http://www.artina.com
Retrieved url : http://www.indyimaging.com
Could not retrieve url : http://www.leanmeanfightingmachine.co.uk
Retrieved url : http://www.victoryhcc.com
Retrieved url : http://www.dolphingraphics.com
Retrieved url : http://www.galacticmarketingllc.com
Retrieved url : http://www.kerleysigns.com
Retrieved url : http://www.epicolor

Retrieved url : http://www.onetechnologies.net
Retrieved url : http://www.sendmepatients.com
Retrieved url : http://www.scadirect.com
Retrieved url : http://www.e5aintegratedmarketing.com
Retrieved url : http://www.eaglexhibit.com
Retrieved url : http://www.jws-associates.co.uk
Retrieved url : http://www.suttersmill.com
Retrieved url : http://www.pposinc.com
Could not retrieve url : http://www.ipak.com
Retrieved url : http://www.cyclonix.com
Retrieved url : http://www.theleadswarehouse.com
Retrieved url : http://www.xplocialteambuilders.com
Retrieved url : http://www.oneworldsf.com
Retrieved url : http://www.ladowntownnews.com
Retrieved url : http://www.carouselsigns.com
Retrieved url : http://www.bkbusinessconsultinginc.com
Retrieved url : http://www.badmonkeycircus.com
Retrieved url : http://www.youvisit.com
Retrieved url : http://www.winbrook.com
Retrieved url : http://www.freemanjournal.net
Retrieved url : http://www.revonsystems.net
Retrieved url : http://www.chicagomag.com

Retrieved url : http://www.lhsigns.com
Retrieved url : http://www.recondistribution.com
Retrieved url : http://www.bswusa.com
Retrieved url : http://www.tambamedia.com
Retrieved url : http://www.gelcomm.com
Retrieved url : http://www.onlinemarijuanadesign.com
Retrieved url : http://www.marketorlando.com
Retrieved url : http://www.tarrinc.com
Retrieved url : http://www.directchannelsgroup.com
Retrieved url : http://www.admarvel.com
Retrieved url : http://www.employmentguide.com
Retrieved url : http://www.bluewingdirect.com
Retrieved url : http://www.b1.com
Retrieved url : http://www.monsterweb.net
Retrieved url : http://www.loyaltysuperstore.com
Retrieved url : http://www.elkinsadvertising.com
Retrieved url : http://www.jaxkarwash.net
Retrieved url : http://www.jnsmarketinginc.com
Retrieved url : http://www.thealchemediaproject.com
Retrieved url : http://www.acmeapparel.com
Retrieved url : http://www.mediaone.com
Could not retrieve url : http://www.rodeoagency.com.au

Retrieved url : http://www.globaleconomicsgroup.com
Retrieved url : http://www.mmmarketinginc.com
Retrieved url : http://www.berrynetwork.com
Retrieved url : http://www.c2creative.com
Retrieved url : http://www.elevate-staffing.com
Could not retrieve url : http://www.creativeasylum.com
Retrieved url : http://www.qmarketing.biz
Retrieved url : http://www.horizonhouse.com
Retrieved url : http://www.firstclasssolutionsinc.com
Retrieved url : http://www.tomorrowsonlinemarketing.com
Retrieved url : http://www.yescooutdoormedia.com
Retrieved url : http://www.zone5.com
Retrieved url : http://www.berline.com
Retrieved url : http://www.bouldervideo.media
Retrieved url : http://www.buadlab.com
Retrieved url : http://www.govbizresults.com
Retrieved url : http://www.334marketing.com
Could not retrieve url : http://www.attik.com
Retrieved url : http://www.fivehype.com
Retrieved url : http://www.integramarketinggroup.com
Retrieved url : http://www.signaturecreative.com
Retrieved url : http://www.the

Retrieved url : http://www.newvisionmarketingny.com
Retrieved url : http://www.militarymediainc.com
Could not retrieve url : http://www.wundermandc.com
Retrieved url : http://www.goldandmae.com
Could not retrieve url : http://www.marketsharegroup.com
Retrieved url : http://www.fastkitpack.com
Could not retrieve url : http://www.ecommercepositioning.com
Retrieved url : http://www.npa.net
Retrieved url : http://www.insidemedia.com
Retrieved url : http://www.trashtalkfcm.com
Retrieved url : http://www.comivo.com
Retrieved url : http://www.sparksandhoney.com
Retrieved url : http://www.goteamdirect.com
Retrieved url : http://www.ramarketing.com
Retrieved url : http://www.clarity.fm
Retrieved url : http://www.zeisgroup.com
Could not retrieve url : http://www.strata-g.com
Retrieved url : http://www.imsgroup.com.au
Retrieved url : http://www.campuscommandos.com
Retrieved url : http://www.keiler.com
Retrieved url : http://www.expolinc.com
Retrieved url : http://www.dataclover.com

Retrieved url : http://www.ideascollide.com
Retrieved url : http://www.gochargenetworks.com
Retrieved url : http://www.calmktg.com
Retrieved url : http://www.hudson-gray.com
Retrieved url : http://www.bpgadvertising.com
Retrieved url : http://www.therockmedia.com
Retrieved url : http://www.northshoremarketinggroup.com
Retrieved url : http://www.uhaps.com
Retrieved url : http://www.pminet.com
Retrieved url : http://www.netword.com
Retrieved url : http://www.arrowheadadv.com
Retrieved url : http://www.eyeonpharma.com
Retrieved url : http://www.compasspointenc.com
Retrieved url : http://www.refcu.org
Retrieved url : http://www.mediaimpressions.com
Retrieved url : http://www.ff0000.com
Retrieved url : http://www.tier1eventmanagement.com
Retrieved url : http://www.sandsregency.com
Retrieved url : http://www.britishpolio.org.uk
Retrieved url : http://www.allthingshospitality.com
Could not retrieve url : http://www.astoneagency.com
Retrieved url : http://www.successfulthinkersnetwork.com

In [37]:
list(filter(lambda x: 'http://www.markmywordsmedia.com' in x, processed_elements ))

[{'http://www.markmywordsmedia.com': b'<!DOCTYPE html><html\nlang="en"><head><link\nrel="stylesheet" type="text/css" href="http://www.markmywordsmedia.com/wp-content/cache/minify/7f908.css" media="all" /><meta\ncharset="utf-8"><meta\nname="viewport" content="width=device-width, initial-scale=1"><title>Local SEO Company | Lead Generation &amp; Internet Marketing</title><link\nrel="shortcut icon" type="image/x-icon" href="http://www.markmywordsmedia.com/wp-content/uploads/2013/09/favicon.png"><link\nrel=\'dns-prefetch\' href=\'//s.w.org\' /> <script type="text/javascript">/*<![CDATA[*/window._wpemojiSettings={"baseUrl":"https:\\/\\/s.w.org\\/images\\/core\\/emoji\\/2.3\\/72x72\\/","ext":".png","svgUrl":"https:\\/\\/s.w.org\\/images\\/core\\/emoji\\/2.3\\/svg\\/","svgExt":".svg","source":{"concatemoji":"http:\\/\\/www.markmywordsmedia.com\\/wp-includes\\/js\\/wp-emoji-release.min.js?ver=9759ecee7a7647cb36a6bb00562d42a7"}};!function(a,b,c){function d(a){var b,c,d,e,f=String.fromCharCode;if
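
Filtering a list of one-entry dictionaries works, but merging them into a single dictionary makes lookups direct. The sketch below uses made-up stand-in data (`sample_elements`, with `example.com` and `example.org` URLs) in the same shape as `processed_elements`; the variable names `pages` and `failed` are illustrative.

```python
# Stand-in data shaped like `processed_elements`: one {url: content} dict
# per site, with "" as the content when the download failed.
sample_elements = [
    {'http://www.example.com': b'<html>example</html>'},
    {'http://www.example.org': ""},
]

# Merge the one-entry dicts into a single url -> content mapping.
pages = {
    url: content
    for entry in sample_elements
    for url, content in entry.items()
}

# Direct lookup replaces the filter over the whole list.
print(pages['http://www.example.com'])

# URLs whose content is empty are the ones that could not be retrieved.
failed = [url for url, content in pages.items() if not content]
print(failed)
```

With the real data, the same comprehension applied to `processed_elements` should give one dictionary keyed by URL.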

Exercise
-----------

1. Download the contents of the websites listed in `data/websites.csv` using the `requests` module (`pip install requests`):
    - write a function that accepts a URL as its parameter and returns the content of the website
2. How fast do you think you can download 2000 websites?
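
A back-of-the-envelope estimate can frame the second question before measuring anything. The sketch below assumes every site takes the same time, treats the 10 s `timeout` used earlier as a rough upper bound per site, and picks 1 s as a guessed typical response time; `estimate_seconds` is a hypothetical helper, and real downloads will vary.

```python
import math


def estimate_seconds(n_sites, seconds_per_site, workers):
    """Rough total time if every site takes `seconds_per_site` and
    `workers` downloads run at the same time."""
    rounds = math.ceil(n_sites / workers)  # batches of parallel downloads
    return rounds * seconds_per_site


for workers in (1, 8, 64):
    typical = estimate_seconds(2000, 1.0, workers)   # assumed 1 s per site
    slowest = estimate_seconds(2000, 10.0, workers)  # every site hits the 10 s timeout
    print('%2d workers: about %6.0f s typical, up to %6.0f s' % (workers, typical, slowest))
```

Comparing these numbers with a measured run shows how close the download is to being limited by waiting on the network rather than by the machine itself.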