<center><h1>Client List with Python </h1></center>

<h3>Using just a few lines of Python, we can take the URL of a web page that lists the names and phone numbers of potential clients and turn that list into a spreadsheet.</h3>

<p>We'll start off by installing modules in Python that will be needed to do this task.</p>
<p>These modules are namely:</p>
    <ul>
    <li>BeautifulSoup (bs4)</li>
    <li>requests</li>
    </ul>

In [3]:
pip install bs4

Note: you may need to restart the kernel to use updated packages.


In [4]:
pip install requests




<h3>Now we'll want to store our URL in a variable.</h3>

<p>Let's use the URL of a yellowpages.com search that shows the top 30 IT companies in Los Angeles.</p>

In [98]:
it_nearme_url = "https://www.yellowpages.com/search?search_terms=it+companies&geo_location_terms=Los+Angeles%2C+CA"
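
<p>The query string in this URL can also be assembled from its parts, which makes it easier to swap in a different search term or city. The sketch below uses <b>urlencode()</b> from Python's standard library. The names "base_url" and "params" are only illustrative, and it assumes the site keeps accepting the same two query parameters that appear in the URL above.</p>

```python
from urllib.parse import urlencode

# Illustrative names; the two parameter keys are copied from the URL above
base_url = "https://www.yellowpages.com/search"
params = {
    "search_terms": "it companies",
    "geo_location_terms": "Los Angeles, CA",
}

# urlencode() turns spaces into "+" and the comma into "%2C"
it_nearme_url = base_url + "?" + urlencode(params)
print(it_nearme_url)
```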

<h3>Before we get started on the actual coding:</h3>

<p>Let's import the modules that we just installed into Python.</p>
<p>We'll also import the <b>csv</b> module in order to use its <b>writer</b> function. This will help us save our work in a .csv file later.</p>

In [143]:
import requests

In [137]:
from bs4 import BeautifulSoup

from csv import writer

<p>The next step is to use the <b>requests</b> module to download the HTML from the URL, then use <b>BeautifulSoup</b> to parse that content so that we can work with it in our code.</p>

In [138]:
r = requests.get(it_nearme_url)

soup = BeautifulSoup(r.content, "html.parser")
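
<p>The two lines above do not check whether the download actually succeeded. As a minimal sketch, the request can be wrapped in a small helper that raises an error when the server answers with a failure status. The name "fetch_soup" and the 10-second timeout are assumptions made for this illustration, not part of the original notebook.</p>

```python
import requests
from bs4 import BeautifulSoup


def fetch_soup(url, timeout=10):
    """Download a page and return it as a parsed BeautifulSoup object."""
    # timeout stops the call from waiting forever on an unresponsive server
    response = requests.get(url, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx and 5xx responses
    response.raise_for_status()
    return BeautifulSoup(response.content, "html.parser")
```

<p>With this helper, "soup = fetch_soup(it_nearme_url)" would stand in for the two lines above.</p>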

<h3>It's time to check that the website's HTML was brought into our workspace.</h3>

In [139]:
print(soup)

<!DOCTYPE html>
<html lang="en"><head><meta content="utf-8" name="charset"/><meta content="IE=edge" http-equiv="X-UA-Compatible"/><meta content="width=device-width, initial-scale=1.0, minimum-scale=1, user-scalable=no" name="viewport"/><title>Best 30 It Companies in Los Angeles, CA with Reviews</title><meta content="It Companies in Los Angeles on YP.com. See reviews, photos, directions, phone numbers and more for the best Computer Technical Assistance &amp; Support Services in Los Angeles, CA." name="description"/><meta content="noindex, follow" name="robots"/><meta content="app-id=284806204, app-argument=ypmobile://srp?search_category=it companies&amp;latitude=34.0522342&amp;longitude=-118.2436849" name="apple-itunes-app"/><meta content="website" property="og:type"/><meta content="Best 30 It Companies in Los Angeles, CA with Reviews" property="og:title"/><meta content="It Companies in Los Angeles on YP.com. See reviews, photos, directions, phone numbers and more for the best Computer 

<h3>Time to put our data together!</h3>

<p>After going through the HTML of the web page, we find that the information for each company is held in a "div" tag with the class "v-card".</p>

<p>We'll declare that as the variable "lists".</p>

<p>From there, we'll create a "for" loop over "lists" and name the loop variable "list".</p>
<p>Using the "find()" method, we take the first "h2" with the class "n" inside each "list" and call it "company".</p>
<p>We'll do the same for the first "div" that has the class "phone", and declare it as the variable "phone_num".</p>
<p>In order to keep the final product from including HTML tags, we can add ".text" to the end of each of our new "list.find()" calls.</p>
<p>To collect our results in a Python list, we'll declare the list as "info" and place "company" and "phone_num" in the square brackets.</p>
<p>Finally, we can use the "print()" function to see the results.</p>

In [141]:
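# Each search result sits in a <div class="v-card"> element
lists = soup.find_all("div", class_="v-card")

for list in lists:
    # .text keeps only the visible text, without the HTML tags
    company = list.find("h2", class_="n").text
    phone_num = list.find("div", class_="phone").text
    info = [company, phone_num]
    print(info)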

['My Computer Works', '(888) 840-0484']
['1. L. A. Computer Works', '(310) 277-9799']
['2. Desktop Conquest', '(213) 321-1869']
['3. My Computer Works Inc', '(877) 221-0118']
['4. Geeks on Site', '(877) 276-1341']
['5. PC AND WEB PROS', '(888) 823-7767']
['6. K-Tech Computer Services', '(213) 739-3400']
['7. Nyps Computers', '(323) 737-4748']
['8. TeamLogic IT', '(310) 385-8548']
['9. Compumax', '(310) 288-0000']
['10. Naxym', '(213) 279-2010']
['11. Adexa Inc', '(310) 642-2100']
['12. Three Dot Solutions', '(323) 256-3864']
['13. Tekreach Solutions', '(888) 890-1935']
['14. VMDOGGI Computer Shops', '(877) 323-6444']
['15. Tech Help Los Angeles', '(213) 986-5722']
['16. Thomas Bolton Network Services', '(323) 497-3441']
['17. Tech Tutor LA', '(310) 592-5630']
['18. Avance Tech Solutions Inc', '(213) 626-1155']
['19. Garland Connect', '(213) 489-1800']
['20. Daekey Technology', '(213) 387-0300']
['21. Bankinfra Technology', '(213) 739-7900']
['22. Stars', '(323) 266-8114']
['23. Simplew

<h3>Looking Good!</h3>

<p>Our data is now paired up in lists, one per company, which we can use to make a spreadsheet.</p>
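
<p>One thing to keep in mind: "find()" returns None when nothing matches, so the loop stops with an AttributeError if a card has no phone number. The sketch below shows one way to guard against that. The sample HTML is made up for the illustration and only imitates the structure described above, and "extract_rows" is a hypothetical helper name.</p>

```python
from bs4 import BeautifulSoup

# Made-up sample markup; the second card deliberately has no phone number
sample_html = """
<div class="v-card"><h2 class="n">1. Example IT Co</h2><div class="phone">(555) 010-0000</div></div>
<div class="v-card"><h2 class="n">2. No Phone Listed LLC</h2></div>
"""


def extract_rows(soup):
    """Return [company, phone_num] pairs, using '' when a phone is missing."""
    rows = []
    for card in soup.find_all("div", class_="v-card"):
        name_tag = card.find("h2", class_="n")
        phone_tag = card.find("div", class_="phone")
        if name_tag is None:
            # Skip cards that have no company name at all
            continue
        company = name_tag.get_text(strip=True)
        phone_num = phone_tag.get_text(strip=True) if phone_tag else ""
        rows.append([company, phone_num])
    return rows


rows = extract_rows(BeautifulSoup(sample_html, "html.parser"))
print(rows)
```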

<p>The final step here is to save our data into a .csv file.</p>

<p>We can use the "writer()" function from the csv module in Python.</p>

<p>Start off by indenting the previous "for" loop so that it sits inside the "with open()" block that we will write above it.</p>

<p>Write your open() call and pass it these arguments:
    <ul>
        <li>The name of the file you wish to create.</li>
        <li>The letter "w", which indicates that you want to "write" a new file or replace the existing one with this name.</li>
        <li>The encoding the file uses (utf8 should be fine).</li>
        <li>newline = '', so that the csv module can handle the line endings itself.</li>
    </ul>

</p>

<p>Begin this line with "with" and end it with "as f", which names the open file "f", followed by a colon to show that the indented lines below belong to it.</p>

<p>Above our original "for" loop, we will write a few lines that set up the writer() function.</p>

<p>Declare writer(f) as "thewriter". Then declare the values you wish to use in the .csv file's header row as "header".</p>

<p>Add another line that calls the "writerow()" method on "thewriter", passing in "header".</p>

<p>We will use writerow() one more time: it replaces the print() call at the bottom of our for loop, and we pass our "info" variable into it instead.</p>

In [142]:
lists = soup.find_all("div", class_="v-card")

# newline='' lets the csv module control line endings in the file
with open('client_list.csv', 'w', encoding='utf8', newline='') as f:
    thewriter = writer(f)
    header = ['Client', 'Contact']
    thewriter.writerow(header)
    for list in lists:
        company = list.find("h2", class_="n").text
        phone_num = list.find("div", class_="phone").text
        info = [company, phone_num]
        # Each call to writerow() adds one line to the .csv file
        thewriter.writerow(info)

<h3>Congratulations!</h3>

<p>A new .csv file should be saved into the folder that houses your Jupyter Notebook.</p>
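
<p>To check a file like this without leaving the notebook, one option is to read it back with the csv module's <b>reader</b> function. The sketch below is self-contained: it writes two made-up rows to a temporary folder instead of touching client_list.csv, and the file name "client_list_demo.csv" is only an example.</p>

```python
import csv
import os
import tempfile

# Made-up rows; the comma in the second name shows that quoting is handled
rows = [
    ["Example IT Co", "(555) 010-0000"],
    ["Sample Tech, Inc.", "(555) 010-0001"],
]
path = os.path.join(tempfile.mkdtemp(), "client_list_demo.csv")

with open(path, "w", encoding="utf8", newline="") as f:
    thewriter = csv.writer(f)
    thewriter.writerow(["Client", "Contact"])
    thewriter.writerows(rows)

# Read the file back to confirm the header and rows were stored as expected
with open(path, encoding="utf8", newline="") as f:
    read_back = [row for row in csv.reader(f)]

print(read_back)
```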

<p>We can open up that .csv in Excel and see if there are any finishing touches we would like to make in our dataset.</p>
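
<p>One such finishing touch can also be done in Python. Most names in the printed results start with what looks like the search-result rank ("1. ", "2. ", and so on). A minimal sketch for removing it, assuming the prefix is always digits followed by a period; "strip_rank" is a hypothetical helper name.</p>

```python
import re


def strip_rank(name):
    """Remove a leading rank such as '12. ' from a company name."""
    # ^\d+\.\s* matches digits, a period, and any spaces at the very start
    return re.sub(r"^\d+\.\s*", "", name)


print(strip_rank("1. L. A. Computer Works"))
print(strip_rank("My Computer Works"))
```

<p>Calling "strip_rank(company)" before building "info" would keep the rank out of the spreadsheet.</p>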