# Web Scraping

In [27]:
import requests
from bs4 import BeautifulSoup
import pandas as pd

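# Fetch the community landing page; fail fast on network or HTTP errors
response = requests.get('https://community.infineon.com/?profile.language=en', timeout=30)
response.raise_for_status()
soup = BeautifulSoup(response.text, 'lxml')

# Each post title sits in a <div class="subject">
titles = soup.find_all('div', class_='subject')

data = []

for title in titles:
    # Find the first body and time elements that follow this title in the document
    body = title.find_next('div', class_='full-body body')
    time = title.find_next('span', class_='time')

    data.append({
        'Title': title.text.strip(),
        # find_next returns None when no matching element follows the title
        'Body': body.text.strip() if body else '',
        # Remove all spaces from the date text, giving values such as "May4,2024"
        'Time': time.text.strip().replace(' ', '') if time else ''
    })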

df = pd.DataFrame(data)

# Save DataFrame to CSV
df.to_csv('infineon_community.csv', index=False)
df

Unnamed: 0,Title,Body,Time
0,Performance comparison between TC399XX and TRA...,"Hello,\nDo you have any benchmark that compare...","May4,2024"
1,Controlling AC load with two mosfets and 1ED31...,"Hi,I am investigating the possibility of contr...","May4,2024"
2,Three Phase Sine Wave Power Supply using CIPOS...,"Hi all,I am an Embedded Engineer and I am look...","May4,2024"
3,hello nice to see you all today eggsoil\n\n\n ...,hello nice to see you all today eggsoil\nhttps...,"May4,2024"
4,Encountering Issues with Programming and Debug...,Translated Content:\nBoard Model: Psoc6-evalua...,"May4,2024"
5,使用PSoc 62系列板卡时遇到了无法烧录和调试的问题，似乎是flash的问题\n\n\n ...,我的板卡型号是Psoc6-evaluationkit-062S2，在我按下板卡上的MODE按...,"May4,2024"
6,Question of Angel PLL observer (MOTIX FOC),"Hi all, I met some questions when I worked wit...","May3,2024"
7,我有一些问题请教你,"Hello,I have a CYUSBKIT-003 board and I am stu...","May3,2024"
8,Unlock the thread\n\n\n Solved,May I have my recent thread unlocked. I need s...,"May3,2024"
9,PSoC6 CY8CPROTO-063,"Hello, good afternoon, I am going to work on a...","May3,2024"
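
The table above shows two quirks in the scraped values: some titles carry extra text after a run of newlines (for example `Unlock the thread\n\n\n Solved`), and `Time` is stored as a compact string such as `May4,2024`. The following is a minimal sketch of how both could be tidied with the standard library. The helper names `split_title` and `parse_time` are hypothetical, and the sketch assumes that the text after the first newline is a status badge and that month names are in English.

```python
from datetime import datetime

def split_title(raw):
    """Split scraped title text into (title, badge).

    Assumption: everything after the first newline is a status badge
    such as "Solved"; the badge is '' when there is no newline.
    """
    head, _, tail = raw.partition('\n')
    return head.strip(), tail.strip()

def parse_time(raw):
    """Parse a compact date such as 'May4,2024' into a datetime.date.

    Assumption: English month names, abbreviated or spelled out.
    """
    for fmt in ('%b%d,%Y', '%B%d,%Y'):
        try:
            return datetime.strptime(raw, fmt).date()
        except ValueError:
            continue
    raise ValueError(f'unrecognised date: {raw!r}')

print(split_title('Unlock the thread\n\n\n Solved'))  # ('Unlock the thread', 'Solved')
print(parse_time('May4,2024'))                        # 2024-05-04
```

Applied to the DataFrame, these could populate separate title, badge and date columns, which makes sorting and filtering by date possible.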


In [28]:
options = soup.find_all('li', class_='options')

print(f'Number of Types: {len(options)} \n')

print('Type: \n')
for option in options:
    print(option.text.strip())

Number of Types: 72 

Type: 

PSoC™ 6
Wi-Fi Combo
Nor Flash
USB low-full-high speed peripherals
FIRST Robotics Competition (FRC)
MOSFET (Si/SiC)
Adapters and Chargers
RF Transistors
OPTIGA™ Trust
CIRRENT™ Product Analytics
AIROC™ Wi-Fi and Wi-Fi Bluetooth Combos
CO₂ sensor
Community Information
Studio Bluetooth
Clocks
Hyper Flash
Robotics
USB hosts hubs transceivers
OPTIGA™ TPM
3D Hall (Magnetic sensor)
CIRRENT™ Cloud ID
AIROC™ Wi-Fi MCUs
IGBT
PSoC™ 4
Smart Bluetooth
Other Technologies General
Hyper RAM
USB superspeed peripherals
ModusToolbox™
GaN
SECORA™ Blockchain
Antenna tuners
Switches (Magnetic sensors)
PSoC™ 5, 3 & 1
AIROC™ Bluetooth
Non Volatile RAM (F-RAM & NVSRAM)
USB EZ-PD™ Type-C
CIPURSE™
Low Noise Amplifiers
Diodes & Thyristors (Si/SiC)
Radar sensor
DAVE™
PSoC™ Creator & Designer
Specialty Memory
OPTIGA™ Connect
CAPSENSE™ & MagSense
RF Switches
Smart Power Switches
Angle (Magnetic sensor)
Code Examples
Wi-Fi Bluetooth for Linux
SRAM
Current (Magnetic sensor)
Gate Driver ICs

# Read Input Data

Each product has its own input data. To keep things simple, we use the "Instant Data Scraper" Chrome extension (https://chromewebstore.google.com/detail/instant-data-scraper/ofaokhiedipichpaobibbnahnkdoiiah) to scrape the data from the community website.

In [56]:
df1 = pd.read_csv('data/PSoC6.csv')
df2 = pd.read_csv('data/Wi-Fi Combo.csv')
df3 = pd.read_csv('data/Nor Flash.csv')
df4 = pd.read_csv('data/USB low-full-high speed peripherals.csv')

ParserError: Error tokenizing data. C error: Expected 5 fields in line 3, saw 13
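
The `ParserError` above means that pandas settled on 5 columns from the start of one file and then met a row with 13 fields on line 3. The traceback does not say which of the four files is affected. One way to investigate, sketched below with the standard `csv` module, is to list every line whose field count differs from the header. The function name `find_ragged_rows` and the sample content are hypothetical and only illustrate the idea.

```python
import csv
import io

def find_ragged_rows(lines):
    """Return (line_number, field_count) for rows whose width differs from the header.

    `lines` is any iterable of text lines, for example an open file object.
    """
    reader = csv.reader(lines)
    expected = None
    ragged = []
    for row in reader:
        if not row:              # skip blank lines
            continue
        if expected is None:     # first non-empty row is treated as the header
            expected = len(row)
            continue
        if len(row) != expected:
            ragged.append((reader.line_num, len(row)))
    return ragged

# Small in-memory stand-in for a scraped file (hypothetical content)
sample = io.StringIO(
    'subject-link,truncated-body\n'
    'Datasheet,"Preliminary datasheet available yet ?"\n'
    'Cortex M4 FPU?,extra,unquoted,commas\n'
)
print(find_ragged_rows(sample))  # [(3, 4)]
```

To check a real file, pass an open handle, for example `open('data/PSoC6.csv', newline='', encoding='utf-8')` (the encoding is an assumption). Note that pandas only raises on rows with too many fields, while this sketch also reports rows that are too short. Once the offending rows are known, they can be fixed at the source; alternatively, `pd.read_csv(..., on_bad_lines='skip')` (pandas 1.3 and later) drops them, at the cost of losing those posts.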


In [48]:
df1.loc[:,['subject-link','truncated-body']] #PSoC6

Unnamed: 0,subject-link,truncated-body
0,使用PSoc 62系列板卡时遇到了无法烧录和调试的问题，似乎是flash的问题,我的板卡型号是Psoc6-evaluationkit-062S2，在我按下板卡上的MODE按...
1,PSoC6 CY8CPROTO-063,"Hello, good afternoon, I am going to work on a..."
2,Encountering Issues with Programming and Debug...,Translated Content:\nBoard Model: Psoc6-evalua...
3,SCB Managing Slave Select Peripheral Lines,- Device Configurator 3.10.0.6117\n- 7e6892ee1...
4,How to properly use dma and spi together with ...,We are having a problem communicating with two...
...,...,...
3336,Datasheet,Preliminary datasheet available yet ?
3337,what about the features of PSOC 6?,It is glad to see PSOC6 released on the offici...
3338,Does PSOC 6 support MESH network?,The video on the official website what I see A...
3339,Cortex M4 FPU?,I have always associated the Cortex-M4 with a ...


In [49]:
df2.loc[:,['subject-link','truncated-body']] #Wi-Fi Combo

Unnamed: 0,subject-link,truncated-body
0,Malloc is thread safe ?,"Hello,\nI'm using WICED STUDIO 6.6. and CYW943..."
1,CYW943907 - Amazon FreeRTOS OTA support,"Hi, Per Amazon doc, CYW943907AEVAL1F is not su..."
2,how to enable or disable the save restore feat...,when I read the register of CHIPCOMMON_SR_CONT...
3,HTTPS speed problem,Good afternoon. I am facing HTTPS speed issue ...
4,Disable dns server.,"Hi, I have a Laird Sterling EWB. What I try to..."
...,...,...
3345,SPI port configuration for SFLASH on other tha...,"[WICED-SDK-2.2.1, MCU: STM32F205]From what I s..."
3346,How do I port a WICED-SDK-1.x app to WICED-SDK...,"I have an app that works with WICED_SDK-1.3, h..."
3347,How do I avoid building the bootloader?,What is the correct way to build an applicatio...
3348,Welcome &amp; Forum Usage Notes,Welcome to the WICED Forum!We hope you find th...


In [50]:
df3.loc[:,['subject-link','truncated-body']] #Nor Flash

Unnamed: 0,subject-link,truncated-body
0,how to use or unprotect the highest address se...,"Hi,\nIam using a S70GL02GT11FHA010 NOR Flash...."
1,S25FL256LAGNFM010 material specifications,I would like to know what the potting compound...
2,S29AL016J70TFN020 Thermal Data?,"Hello,\nWe would like to use this memory: S29A..."
3,Where is the file ---- slld_fll_256l.h,A member from Infineon posted me a zip file re...
4,S29JL032J60TFI010 of Product Status,How is the production of S29JL032J60TFI010 pro...
...,...,...
1840,关于fx3芯片的寄存器,在ez-usb fx3芯片的technical reference manual中的chap...
1841,igbt module,"Good afternoon, would it be possible to replac..."
1842,ゲートドライブ回路の推奨設計について,DCDCの設計を行っているのですが、車のレイアウトの関係で基板が２相でまたがってしまいます。...
1843,MOTIX 6EDL7141 issues,I'm currently working on motor driver wit usag...


In [53]:
df4.loc[:,['subject-link','truncated-body']] #USB low-full-high speed peripherals

Unnamed: 0,subject-link,truncated-body
0,"CY7C65215, configuration image CRC","From a linux system, I want to read back the c..."
1,Arming of bulk/interrupt out endpoint in fx2lp,"Hello,I am stuck at a point,in my application ..."
2,Pinout error on documentation of CY7C6514D,"Hi Infineon Community,\nI am reviewing the sec..."
3,CY7C65210 TID Number,"Hi,\nWhat is CY7C65210 TID number?\nBR\nEason"
4,Setting up isochronous out endpoint in fx2lp,"Hello,Can someone point me to examples codes f..."
...,...,...
2729,external RAM Selection for TC387,"We are looking for external RAM (around 2MB, p..."
2730,Spice model incompatibility IMZA120R014M1H_L3,"Dear Infineon team,the spice model of the IMZA..."
2731,secure boot using slb9673,"Hi team, \nwe are using the slb9673 tpm2 chip ..."
2732,"(SDK 1.3.5) No ""CyU3PDmaMultiChannelGetBuffer(...","Dear Support\n \nAs KBA231382, if all buffers ..."
