# Optional Web Scraping Exercise

This exercise consists of extracting data from a web page, processing it, and saving it to a `csv` file. To do so, you must:

1. Extract the articles on the home page of [https://slashdot.org/](https://slashdot.org/) using `BeautifulSoup`.
2. Process the data and store it in a `DataFrame`.
3. Create a `csv` file from that `DataFrame`.

## Import libraries

In [112]:
import requests

from bs4 import BeautifulSoup

import pandas as pd

## Scrape the articles

In [113]:
response = requests.get("https://slashdot.org/", timeout=10)  # avoid hanging forever on a stalled connection

In [123]:
# Fail loudly on a non-2xx response instead of leaving `articles` undefined
response.raise_for_status()
soup = BeautifulSoup(response.content, 'lxml')
articles = soup.find_all('article', class_='fhitem')

In [128]:
data = []

for a in articles:
    # Headline of the story
    title = a.find('span', class_='story-title').text.strip()
    # The source link is optional: some stories have none, so fall back to None
    link_element = a.find('a', class_='story-sourcelnk')
    link = link_element['href'].strip() if link_element and 'href' in link_element.attrs else None
    description = a.find('div', class_='body').text.strip()
    # The `datetime` attribute holds a display string, not an ISO timestamp
    date = a.time.attrs['datetime']

    data.append({'Title': title, 'Link': link, 'Description': description, 'Date': date})
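
The loop above guards only the link against a missing element; `title` and `description` would raise `AttributeError` if their tags were absent. A minimal sketch of one way to guard every text field follows. The helper name `text_or_none` and the inline HTML sample are hypothetical, made up for illustration, and `html.parser` is used here only so the sketch does not depend on `lxml`.

```python
from bs4 import BeautifulSoup


def text_or_none(parent, name, class_name):
    """Return the stripped text of the first matching tag, or None if absent."""
    element = parent.find(name, class_=class_name)
    return element.get_text().strip() if element else None


# Hypothetical markup that mimics the class names used in the loop above
html = """
<article class="fhitem">
  <span class="story-title">Example headline</span>
  <div class="body"> Example summary. </div>
</article>
<article class="fhitem">
  <span class="story-title">Headline only</span>
</article>
"""

soup = BeautifulSoup(html, "html.parser")
rows = [
    {
        "Title": text_or_none(a, "span", "story-title"),
        "Description": text_or_none(a, "div", "body"),
    }
    for a in soup.find_all("article", class_="fhitem")
]
print(rows)
```

With this approach a story that lacks a field yields `None` for it, and the rest of the page is still collected.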

## Save the DataFrame

In [131]:
df = pd.DataFrame(data)

In [132]:
df.to_csv('slashdot_articles.csv', index=False, encoding='utf-8')
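
One detail of the CSV round trip is worth seeing in isolation: a `None` link is written as an empty field and comes back from `read_csv` as a missing value (`NaN`). The sketch below shows this with an in-memory buffer instead of a file. The two sample rows and the name `sample` are invented for illustration.

```python
import io

import pandas as pd

# Invented rows: one with a source link, one without
sample = pd.DataFrame([
    {"Title": "Story with a source", "Link": "https://example.com/a"},
    {"Title": "Story without a source", "Link": None},
])

# Write to an in-memory buffer rather than to disk, then read it back
buffer = io.StringIO()
sample.to_csv(buffer, index=False)
buffer.seek(0)
restored = pd.read_csv(buffer)

# True marks rows whose link is missing after the round trip
missing = restored["Link"].isna().tolist()
print(missing)
```

If missing links should be distinguishable later, `restored["Link"].isna()` is a convenient mask to filter on.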

In [133]:
df = pd.read_csv('slashdot_articles.csv')
df

|   | Title | Link | Description | Date |
|---|-------|------|-------------|------|
| 0 | Mr. Cooper Hackers Stole Personal Data on 14 M... | https://techcrunch.com/2023/12/18/mr-cooper-ha... | Hackers stole the sensitive personal informati... | on Monday December 18, 2023 @02:20PM |
| 1 | OpenAI Lays Out Plan For Dealing With Dangers ... | https://www.washingtonpost.com/technology/2023... | OpenAI, the AI company behind ChatGPT, laid ou... | on Monday December 18, 2023 @01:40PM |
| 2 | Imran Khan Deploys AI Clone To Campaign From B... | https://www.theguardian.com/world/2023/dec/18/... | AI allowed Pakistan's former prime minister Im... | on Monday December 18, 2023 @01:00PM |
| 3 | US Lawmakers Warn Biden To Probe EU Targeting ... | https://finance.yahoo.com/news/exclusive-us-la... | A bipartisan group of lawmakers has written to... | on Monday December 18, 2023 @12:20PM |
| 4 | Lawmakers Push DOJ To Investigate Apple Follow... | https://www.theverge.com/2023/12/18/24006037/a... | Following a tumultuous few weeks for Beeper, w... | on Monday December 18, 2023 @11:40AM |
| 5 | Southwest Will Pay a $140 Million Fine For Its... |  | Southwest Airlines is still paying for its mel... | on Monday December 18, 2023 @11:00AM |
| 6 | Documents Reveal Hidden Problems at Russia's N... |  | An anonymous reader shares a report: As Russia... | on Monday December 18, 2023 @10:20AM |
| 7 | Deloitte Is Looking To AI To Help Avoid Mass L... | https://www.bloomberg.com/news/articles/2023-1... | The giants of the consulting world face an unu... | on Monday December 18, 2023 @09:40AM |
| 8 | Adobe Abandons $20 Billion Acquisition of Figm... | https://www.theverge.com/2023/12/18/24005996/a... | Following mounting pressure from regulators in... | on Monday December 18, 2023 @09:00AM |
| 9 | 2023's Online 'Advent Calendars' Challenge Pro... |  | It's a geek tradition that started online back... | on Monday December 18, 2023 @07:34AM |
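
The `Date` column holds display strings such as `on Monday December 18, 2023 @02:20PM`, which cannot be sorted or filtered as dates. A minimal sketch of converting one of them with the standard library follows. The format string is inferred from the sample output above, not from any documented Slashdot format, and the function name `parse_slashdot_date` is hypothetical. The strings carry no time zone, so the result is a naive `datetime`.

```python
from datetime import datetime

# Format inferred from the sample output (assumption, not a documented spec)
SLASHDOT_FORMAT = "on %A %B %d, %Y @%I:%M%p"


def parse_slashdot_date(raw):
    """Parse a Slashdot display date into a naive datetime (no time zone)."""
    return datetime.strptime(raw.strip(), SLASHDOT_FORMAT)


parsed = parse_slashdot_date("on Monday December 18, 2023 @02:20PM")
print(parsed.isoformat())  # 2023-12-18T14:20:00
```

The same function could be applied to the whole column with `df['Date'].map(parse_slashdot_date)`, assuming every row follows this pattern.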
