## Working with Beautiful Soup

#### Load the necessary libraries

In [73]:
import requests
from bs4 import BeautifulSoup as bs
import csv

In [74]:
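# Download the home page
source = requests.get('http://coreyms.com')

# Parse the HTML with the lxml parser (requires the lxml package)
soup = bs(source.text, 'lxml')

# Print the whole parsed document with indentation
print(soup.prettify())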

<!DOCTYPE html>
<html lang="en-US">
 <head>
  <meta charset="utf-8"/>
  <meta content="width=device-width, initial-scale=1" name="viewport"/>
  <!-- This site is optimized with the Yoast SEO plugin v15.4 - https://yoast.com/wordpress/plugins/seo/ -->
  <title>
   CoreyMS - Development, Design, DIY, and more
  </title>
  <meta content="Development, Design, DIY, and more" name="description"/>
  <meta content="index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1" name="robots"/>
  <link href="https://coreyms.com/" rel="canonical"/>
  <link href="https://coreyms.com/page/2" rel="next"/>
  <meta content="en_US" property="og:locale"/>
  <meta content="website" property="og:type"/>
  <meta content="CoreyMS - Development, Design, DIY, and more" property="og:title"/>
  <meta content="Development, Design, DIY, and more" property="og:description"/>
  <meta content="https://coreyms.com/" property="og:url"/>
  <meta content="CoreyMS" property="og:site_name"/>
  <meta content

In [75]:
article = soup.find('article')
print(article.prettify())

<article class="post-1670 post type-post status-publish format-standard has-post-thumbnail category-development category-python tag-gzip tag-shutil tag-zip tag-zipfile entry" itemscope="" itemtype="https://schema.org/CreativeWork">
 <header class="entry-header">
  <h2 class="entry-title" itemprop="headline">
   <a class="entry-title-link" href="https://coreyms.com/development/python/python-tutorial-zip-files-creating-and-extracting-zip-archives" rel="bookmark">
    Python Tutorial: Zip Files – Creating and Extracting Zip Archives
   </a>
  </h2>
  <p class="entry-meta">
   <time class="entry-time" datetime="2019-11-19T13:02:37-05:00" itemprop="datePublished">
    November 19, 2019
   </time>
   by
   <span class="entry-author" itemprop="author" itemscope="" itemtype="https://schema.org/Person">
    <a class="entry-author-link" href="https://coreyms.com/author/coreymschafer" itemprop="url" rel="author">
     <span class="entry-author-name" itemprop="name">
      Corey Schafer
     </spa

In [76]:
# Get the headline.
headline = article.h2.a.text
print(headline)

Python Tutorial: Zip Files – Creating and Extracting Zip Archives


In [77]:
# Get the Summary text
summary = article.find('div', class_='entry-content').p.text
print(summary)

In this video, we will be learning how to create and extract zip archives. We will start by using the zipfile module, and then we will see how to do this using the shutil module. We will learn how to do this with single files and directories, as well as learning how to use gzip as well. Let’s get started…
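
The headline and summary lookups above use two different navigation styles. As a minimal sketch on made-up markup (the HTML and its text are illustrative, not taken from the site), dotted access such as `article.h2.a` follows the first matching descendant at each step, while `find` with `class_` filters by CSS class:

```python
from bs4 import BeautifulSoup

# Hypothetical markup that mimics the structure of one post.
html = """
<article>
  <h2 class="entry-title"><a href="https://example.com/post">Example post</a></h2>
  <div class="entry-meta"><p>Not the summary</p></div>
  <div class="entry-content"><p>First paragraph.</p><p>Second paragraph.</p></div>
</article>
"""

# html.parser ships with Python, so no extra parser package is needed here.
soup = BeautifulSoup(html, 'html.parser')
article = soup.find('article')

# Dotted access returns the first matching descendant tag at each step.
headline = article.h2.a.text
link = article.h2.a['href']

# find() with class_ selects the div by CSS class; .p is that div's first <p>.
summary = article.find('div', class_='entry-content').p.text

print(headline)
print(link)
print(summary)
```

Note that `article.p` on its own would return the paragraph inside `entry-meta`, which is why the summary lookup narrows to the `entry-content` div first.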


In [78]:
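# Get the video link
vid_src = article.find('iframe', class_='youtube-player')['src']
# print(vid_src)  # uncomment to inspect the full URL
# The video ID is the fifth '/'-separated piece of the URL ...
vid_id = vid_src.split('/')[4]
# ... with the query string after '?' dropped
vid_id = vid_id.split('?')[0]
print(vid_id)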

z0gguhEmWiY
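
Splitting on `/` and indexing position 4 works only while the URL keeps exactly that shape. A possible alternative, sketched here with the standard library's `urllib.parse`, reads the ID from the URL path instead. The helper name `youtube_id_from_embed` and the example URL are assumptions for illustration; the real `src` value is not printed in this notebook.

```python
from urllib.parse import urlparse


def youtube_id_from_embed(src):
    """Return the video ID from an embed-style URL, or None if the shape differs."""
    # urlparse separates the path from the query string, so no manual '?' split is needed.
    path = urlparse(src).path
    parts = [piece for piece in path.split('/') if piece]
    if len(parts) >= 2 and parts[0] == 'embed':
        return parts[1]
    return None


# Assumed example URL in the '/embed/<id>?<query>' shape.
example_src = 'https://www.youtube.com/embed/z0gguhEmWiY?version=3&rel=1'
print(youtube_id_from_embed(example_src))
```

Returning `None` for an unexpected URL keeps the caller in control of what to write when no ID can be found.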


In [79]:
# Create the YouTube link
yt_link = f'https://www.youtube.com/watch?v={vid_id}'
print(yt_link)

https://www.youtube.com/watch?v=z0gguhEmWiY


#### Get all the data
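
The loop in this section relies on `find_all`, which returns every match, whereas `find` returns only the first. A small sketch on made-up markup (the three posts are illustrative) shows the difference, including what each call gives back when nothing matches:

```python
from bs4 import BeautifulSoup

# Hypothetical page with three posts and no iframe.
html = """
<article><h2><a>First post</a></h2></article>
<article><h2><a>Second post</a></h2></article>
<article><h2><a>Third post</a></h2></article>
"""
soup = BeautifulSoup(html, 'html.parser')

first = soup.find('article')             # a single Tag (the first match)
all_articles = soup.find_all('article')  # a list-like collection of every match

headlines = [article.h2.a.text for article in all_articles]
print(first.h2.a.text)
print(headlines)

# When nothing matches, find() returns None and find_all() returns an empty list.
missing_one = soup.find('iframe')
missing_all = soup.find_all('iframe')
print(missing_one, list(missing_all))
```

The `None` result from `find` is what makes the `try`/`except` around the iframe lookup necessary for posts without a video.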

In [80]:
# newline='' lets the csv module control line endings;
# the with block closes the file even if the loop raises an error.
with open('test_scrape.csv', 'w', newline='', encoding='utf-8') as csv_file:
    csv_writer = csv.writer(csv_file)
    csv_writer.writerow(['headline', 'summary', 'video_link'])

    for article in soup.find_all('article'):
        # Get the headline
        headline = article.h2.a.text
        print(headline)

        # Get the summary
        summary = article.find('div', class_='entry-content').p.text
        print(summary)

        try:
            vid_src = article.find('iframe', class_='youtube-player')['src']
            vid_id = vid_src.split('/')[4]
            vid_id = vid_id.split('?')[0]
            # Create the YouTube link
            yt_link = f'https://www.youtube.com/watch?v={vid_id}'
        except (TypeError, KeyError, IndexError):
            # No iframe (find() returned None), no src attribute,
            # or a URL with fewer pieces than expected
            yt_link = None

        print(yt_link)
        print()

        csv_writer.writerow([headline, summary, yt_link])

Python Tutorial: Zip Files – Creating and Extracting Zip Archives
In this video, we will be learning how to create and extract zip archives. We will start by using the zipfile module, and then we will see how to do this using the shutil module. We will learn how to do this with single files and directories, as well as learning how to use gzip as well. Let’s get started…
https://www.youtube.com/watch?v=z0gguhEmWiY

Python Data Science Tutorial: Analyzing the 2019 Stack Overflow Developer Survey
In this Python Programming video, we will be learning how to download and analyze real-world data from the 2019 Stack Overflow Developer Survey. This is terrific practice for anyone getting into the data science field. We will learn different ways to analyze this data and also some best practices. Let’s get started…
https://www.youtube.com/watch?v=_P7X8tMplsw

Python Multiprocessing Tutorial: Run Code in Parallel Using the Multiprocessing Module
In this Python Programming video, we will be learni
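
To check what ends up in the CSV file, the rows can be written and read back in memory. This is a minimal sketch with made-up rows and an `io.StringIO` buffer standing in for the file; it shows that the `csv` module quotes a summary containing a comma and writes `None` as an empty field.

```python
import csv
import io

# Illustrative rows; the second one mimics a post without a video.
rows = [
    ['Example headline', 'Example summary, with a comma',
     'https://www.youtube.com/watch?v=z0gguhEmWiY'],
    ['Post without a video', '', None],
]

# An in-memory buffer stands in for the CSV file.
buffer = io.StringIO(newline='')
writer = csv.writer(buffer)
writer.writerow(['headline', 'summary', 'video_link'])
writer.writerows(rows)

# Read the same data back, using the header row as dictionary keys.
buffer.seek(0)
records = list(csv.DictReader(buffer))
for record in records:
    print(record['headline'], '->', record['video_link'] or 'no video')
```

Because `None` comes back as an empty string, code that reads the file later has to treat `''` as "no video".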

In [81]:
# Get all data from this website, page by page

# newline='' lets the csv module control line endings;
# the with block closes the file even if the loop raises an error.
with open('test_scrape_1.csv', 'w', newline='', encoding='utf-8') as csv_file:
    csv_writer = csv.writer(csv_file)
    csv_writer.writerow(['headline', 'summary', 'video_link'])

    page_number = 1
    while True:
        url = f'https://coreyms.com/page/{page_number}'
        response = requests.get(url)
        # BeautifulSoup was imported under the alias `bs` in the first cell
        soup = bs(response.content, 'html.parser')

        articles = soup.find_all('article')
        if not articles:
            break  # No more pages to scrape

        for article in articles:
            # Get the headline.
            # NOTE: article.h2 is None for a post without an <h2>, which raises
            # the AttributeError shown at the end of the output.
            headline = article.h2.a.text
            print(headline)

            # Get the summary
            summary = article.find('div', class_='entry-content').p.text
            print(summary)

            try:
                vid_src = article.find('iframe', class_='youtube-player')['src']
                vid_id = vid_src.split('/')[4]
                vid_id = vid_id.split('?')[0]
                # Create the YouTube link
                yt_link = f'https://www.youtube.com/watch?v={vid_id}'
            except (TypeError, KeyError, IndexError):
                # No iframe, no src attribute, or an unexpected URL shape
                yt_link = None

            print(yt_link)
            print()

            csv_writer.writerow([headline, summary, yt_link])

        page_number += 1


Python Tutorial: Zip Files – Creating and Extracting Zip Archives
In this video, we will be learning how to create and extract zip archives. We will start by using the zipfile module, and then we will see how to do this using the shutil module. We will learn how to do this with single files and directories, as well as learning how to use gzip as well. Let’s get started…
https://www.youtube.com/watch?v=z0gguhEmWiY

Python Data Science Tutorial: Analyzing the 2019 Stack Overflow Developer Survey
In this Python Programming video, we will be learning how to download and analyze real-world data from the 2019 Stack Overflow Developer Survey. This is terrific practice for anyone getting into the data science field. We will learn different ways to analyze this data and also some best practices. Let’s get started…
https://www.youtube.com/watch?v=_P7X8tMplsw

Python Multiprocessing Tutorial: Run Code in Parallel Using the Multiprocessing Module
In this Python Programming video, we will be learni

Python Requests Tutorial: Request Web Pages, Download Images, POST Data, Read JSON, and More
In this Python Programming Tutorial, we will be learning how to use the Requests library. The Requests library allows us to send HTTP requests and interact with web pages. We will be learning how to grab the source code of a site, download images, POST form data to routes, read JSON responses, perform authentication, and more. Let’s get started…
https://www.youtube.com/watch?v=tb8gHvYlCFs

Python Django Tutorial: Deploying Your Application (Option #2) – Deploy using Heroku
In this Python Django Tutorial, we will be learning how to deploy our application to Heroku. Heroku is a platform that abstracts away a lot of the low-level system administration and allows us to easily deploy, update, and rollback changes for our application. Let’s get started…
https://www.youtube.com/watch?v=6DI_7Zja8Zc

Python Django Tutorial: Full-Featured Web App Part 13 – Using AWS S3 for File Uploads
In this Python Dja

Linux/Mac Terminal Tutorial: The Grep Command – Search Files and Directories for Patterns of Text
In this Linux/Mac terminal tutorial, we will be learning how to use the grep command. The grep command allows us to search files and directories for patterns of text. You can also pipe the output of one command into grep to get certain matches. It’s extremely useful once you learn the ins and outs. Let’s get started…
https://www.youtube.com/watch?v=VGgTmxXp7xQ

How to Run Linux/Bash on Windows 10 Using the Built-In Windows Subsystem for Linux
In this video, we will be learning how to run Linux on Windows using the new Windows Subsystem for Linux that comes with Windows 10. This is an excellent way to run Bash on a Windows machine. It allows you to use all of the Bash commands we are used to using on Linux within a Windows system. We will be showing how to enable and install Linux on Windows and also go over a quick overview to see how this works. Let’s get started…
https://www.youtube.com/

How to Create a Network of Machines in VirtualBox with SSH Access
In this video, we’ll be learning how to clone Virtual Machines, add these machines to a network so they can communicate with each other, make sure they have internet access, and also set up SSH so that we are able to SSH into these machines from our host machine. This will allow us to pretty much build an entire virtual lab that we can use to test all kinds of different software. So after we’re done, this will give us the ability to quickly spin up a new VM that behaves just like a real machine on our network. Let’s get started.
https://www.youtube.com/watch?v=S7jD6nnYJy0

VirtualBox: How to Use Snapshots
In this video, we will be learning how to use snapshots within VirtualBox. Snapshots are great for saving a machine state and being able to revert back to a previous time. Let’s get started.
https://www.youtube.com/watch?v=Qte4X-rdr2Q

Python Beginner Tutorials – Complete Series
Welcome to a nine-part series on Python P

Python Tutorial: File Objects – Reading and Writing to Files
In this Python Tutorial, we will be learning how to read and write to files. You will likely come into contact with file objects at some point while using Python, so knowing how to read and write from them is extremely important. We will learn how to read and write from simple text files, open multiple files at once, and also how to copy image binary files. Let’s get started.
https://www.youtube.com/watch?v=Uh2ebFW8OYM

Python Tutorial: OS Module – Use Underlying Operating System Functionality
In this Python Tutorial, we will be going over the ‘os’ module. The os module allows us to access functionality of the underlying operating system. So we can perform tasks such as: navigate the file system, obtain file information, rename files, search directory trees, fetch environment variables, and many other operations. We will cover a lot of what the os module has to offer in this tutorial, so let’s get started.
https://www.youtube

SQL for Beginners: SELECT – Retrieving Records from Your Database
In this video we will continue learning SQL Basics by retrieving records from our database using the SELECT statement. Once we learn how to use the SELECT statement, we will learn how to filter records that match a certain criteria with the WHERE Clause. Lastly, we will learn how to sort our results using the ORDER BY statement. Let’s get started:
https://www.youtube.com/watch?v=-FPVPcq28r4

Git: Fixing Common Mistakes and Undoing Bad Commits
In this video we will look at some common mistakes in Git and how we can fix these mistakes. Specifically we will cover how to discard changes since your last commit, amending commits, cherry-picking hashes, resetting to a specific commit, and reverting to a specific commit. Let’s get started:
https://www.youtube.com/watch?v=FdZecVxzJbk

Setting up a Python Development Environment in Eclipse
In this video, we will be setting up a Python development environment in Eclipse using the P

Programming Terms: Mutable vs Immutable
In this programming terms video, we will be going over the difference between mutable and immutable objects. An immutable object is an object whose state cannot be modified after it is created. This is in contrast to a mutable object, which can be modified after it is created. Let’s take a look at some examples as to what exactly this means and why it is important to know.
https://www.youtube.com/watch?v=5qQQ3yzbKp8

Python: Namedtuple – When and why should you use namedtuples?
Named Tuples in Python are High-performance container datatypes. What advantage do namedtuples have over regular tuples and when should you use them? In this video, we’ll take a look at namedtuples and why you should use them.
https://www.youtube.com/watch?v=GfxJYp9_nJA

Programming Terms: Idempotence
In this programming terms video, we will be going over Idempotence. Idempotence is the property of certain operations in mathematics and computer science, that can be applied

Make an American Girl Doll Bed

None

Sublime Text 2: Setup, Package Control, and Settings
Here we’ll do a quick walkthrough on setting up a development environment using Sublime Text 2.
https://www.youtube.com/watch?v=uOMk8MlE_v4

Using Font Awesome in Desktop Applications (OS X)
A Quick walkthrough on how you can download and use the Font Awesome icon font in your Mac desktop applications such as Photoshop, GIMP, Illustrator, Pages, and more.
https://www.youtube.com/watch?v=OlpVKUpraao

Build a Platform Bed Frame

None

Make a Raised Dog Feeder

None

Prevent Picasa from Scanning Folders
I love using Picasa for viewing and editing my photos. What I don’t love is that it automatically scans and imports tons of unwanted photos by default. This creates a ton of clutter and makes it difficult for me to find the actual photos I want to work with. Fortunately it is easy to prevent Picasa from scanning folders.
None

How to Build a Paver Patio

None

Quick Tip: Use a Wooden Pallet as a Lumb

AttributeError: 'NoneType' object has no attribute 'a'
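
The `AttributeError` comes from `article.h2.a`: when a post has no `<h2>`, `article.h2` is `None`, and `None` has no attribute `a`. One way to avoid the crash, sketched below on made-up markup, is to look each element up first and only read its text if it was found. The helper `text_or_none` is a hypothetical name introduced for this example.

```python
from bs4 import BeautifulSoup

# Hypothetical page: the second post has no <h2>, like the one that broke the loop.
html = """
<article><h2><a href="#">Post with a headline</a></h2>
<div class="entry-content"><p>Summary text.</p></div></article>
<article><div class="entry-content"><p>No headline here.</p></div></article>
"""


def text_or_none(tag):
    """Return the stripped text of a tag, or None when the tag was not found."""
    return tag.get_text(strip=True) if tag is not None else None


soup = BeautifulSoup(html, 'html.parser')
results = []
for article in soup.find_all('article'):
    # select_one() takes a CSS selector and returns None when nothing matches.
    link = article.select_one('h2 a')
    paragraph = article.select_one('div.entry-content p')
    results.append((text_or_none(link), text_or_none(paragraph)))

for headline, summary in results:
    print(headline, '|', summary)
```

With this approach a post without a headline produces `None` in the first column instead of stopping the scrape; whether to write such rows or skip them is a separate choice.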