In this notebook, we'll use Requests to fetch a weather forecast. We haven't seen how to parse HTML yet, but luckily, some websites are meant to be used from a terminal (e.g. with `curl`) and return plain text instead of HTML pages.

In this example, we'll use https://wttr.in, one such website, which returns a weather forecast. Try opening it in your browser. As you can see by looking at the page source, the page served to a browser is in fact formatted using HTML.

However, if we use `curl` (a terminal HTTP client), we get a different output. If you have `curl` installed (it might not be available on your system), you can try this out using:

    curl wttr.in
    
Let's now try this with Requests.

In [1]:
import requests

In [2]:
print(requests.get('https://wttr.in').text)

Weather report: Antwerp, Belgium

      \   /     Sunny
       .-.      28 °C
    ― (   ) ―   ← 15 km/h
       `-’      10 km
      /   \     0.0 mm
                                                       ┌─────────────┐                                                       
┌──────────────────────────────┬───────────────────────┤  Fri 31 Jul ├───────────────────────┬──────────────────────────────┐
│            Morning           │             Noon      └──────┬──────┘     Evening           │             Night            │
├──────────────────────────────┼──────────────────────────────┼──────────────────────────────┼──────────────────────────────┤
│    \  /       Partly cloudy  │     \   /     Sunny          │    \  /       Partly cloudy  │    \  /       Partly cloudy 

As you can see, we also get a nicely formatted textual response when using Requests.
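
The plain-text response is colored for terminals with ANSI escape sequences, which get in the way when the text is stored or processed rather than displayed. The following is a minimal sketch of how they could be removed using only the standard library. The helper name `strip_ansi` is hypothetical, and the pattern assumes that only color/style (SGR) sequences occur in the text.

```python
import re

# Matches ANSI SGR (color/style) sequences such as ESC[38;5;226m or ESC[0m.
# Assumption: only SGR sequences need removing; other escape types are not handled.
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def strip_ansi(text):
    """Return text with ANSI color/style escape sequences removed."""
    return ANSI_SGR.sub("", text)


# A small hand-written sample in the same style as the weather report.
sample = "\x1b[38;5;214m28\x1b[0m °C"
print(strip_ansi(sample))  # prints "28 °C"
```

The same helper could be applied to the text of a response, e.g. `strip_ansi(requests.get('https://wttr.in').text)`.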

So how does this website know which client we're using? Let's take another look at the request headers that Requests sends.

In [3]:
requests.get('https://wttr.in').request.headers

{'User-Agent': 'python-requests/2.24.0', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}
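
These headers are the defaults that Requests attaches when we don't specify any ourselves. As a small sketch, they can also be inspected without sending a request at all, using `requests.utils.default_headers()`; the exact values depend on the installed version of Requests.

```python
import requests

# Build the default headers locally; no request is sent here.
defaults = requests.utils.default_headers()

# Print each default header name and its value.
for name, value in defaults.items():
    print(f"{name}: {value}")
```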

Notice something there? Indeed, Requests sets a `User-Agent` header that announces itself. For many scraping projects, you'll actually have to change this header in order to pose as a regular web browser. For this website, let's try changing the header to something it doesn't recognize, so that the site falls back to sending HTML:

In [4]:
print(requests.get('https://wttr.in', headers={
    'User-Agent': 'Totally a real browser'
}).text[:200])

<html>
<head><title>Weather report: Antwerp, Belgium</title><meta property="og:image" content="http://wttr.in/_0pq.png" /><meta property="og:site_name" content="wttr.in" /><meta property="og:type" con
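
When several requests should carry the same custom `User-Agent`, passing `headers=` on every call gets repetitive. One possible approach, sketched below, is to set the header once on a `requests.Session`. The request is only prepared, not sent, so we can check which header would go out; the variable names are illustrative.

```python
import requests

# The same made-up User-Agent string as in the cell above.
CUSTOM_UA = "Totally a real browser"

# Headers set on a session are merged into every request made through it.
session = requests.Session()
session.headers.update({"User-Agent": CUSTOM_UA})

# Prepare (but do not send) a request to see which headers would be used.
prepared = session.prepare_request(requests.Request("GET", "https://wttr.in"))
print(prepared.headers["User-Agent"])
```

Calling `session.get('https://wttr.in')` would then send this `User-Agent` along with the other session headers.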
