<a href="https://colab.research.google.com/github/Harsh-Patel25/Python/blob/main/daily_lessons/Day_12_Pandas_Working_With_HTML_%26_XML.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# ✅ **Tutorial 12: Python Pandas - Working With HTML & XML 📄🌐 (Part 4)**

---

### 📌 **1. Introduction – How to Use HTML and XML Data with Pandas**

Fetching data from real-world web pages and XML APIs is a powerful skill 📡  
In this lesson we will learn how to:

- Read HTML tables with `read_html()`
- Convert a DataFrame to an HTML file with `to_html()`
- Load XML data into a Pandas DataFrame with `read_xml()`
- Export a DataFrame to XML format with `to_xml()`  

Use cases:
- Web scraping 📊  
- Getting data from HTML/XML APIs 🛠️  
- Analyzing data from sources like Wikipedia 📚  

---

### 🧪 **2. Basic: `read_html()` Example – Reading a Table from Wikipedia**

```python
import pandas as pd

html = pd.read_html("https://en.wikipedia.org/wiki/Mobile_country_code")
print(type(html))
```

📌 Output:
```python
<class 'list'>
```

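`read_html()` returns a list of DataFrames, one per table it finds on the page. If the page has multiple tables, access each one by its list index. The function needs an HTML parser library such as `lxml` (or `beautifulsoup4` with `html5lib`) to be installed.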
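
The list behaviour can also be checked offline with a small in-memory page. This is a minimal sketch, assuming `lxml` (or `beautifulsoup4` with `html5lib`) is installed; the `page` string is a hypothetical snippet that reuses a few values from the tables in this lesson, not a real Wikipedia download.

```python
from io import StringIO

import pandas as pd

# Hypothetical page with two <table> elements (illustrative only)
page = """
<html><body>
  <table>
    <tr><th>Mobile country code</th><th>Country</th></tr>
    <tr><td>412</td><td>Afghanistan</td></tr>
    <tr><td>276</td><td>Albania</td></tr>
  </table>
  <table>
    <tr><th>Location</th><th>July</th></tr>
    <tr><td>Minneapolis</td><td>83/64</td></tr>
    <tr><td>Duluth</td><td>76/55</td></tr>
  </table>
</body></html>
"""

# read_html() returns one DataFrame per <table> it finds
tables = pd.read_html(StringIO(page))

# Inspect each table by its position in the list
for i, table in enumerate(tables):
    print(i, table.shape, list(table.columns))
```

Looping over the list like this is a quick way to find the index of the table you actually want before working with it.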

```python
html[1]
```

📊 **Output Table (Simplified):**

| Mobile country code | Country | ISO 3166 | Mobile network codes | Remarks |
|----------------------|---------|-----------|-----------------------|----------|
| 289 | Abkhazia | GE-AB | List... | MCC not listed by ITU |
| 412 | Afghanistan | AF | List... | NaN |
| 276 | Albania | AL | List... | NaN |
| ... | ... | ... | ... | ... |

---

### 🔄 **3. `to_html()` – Save DataFrame to HTML File 📝**

```python
html[0].to_html('demo.html')
```

📌 This line saves the DataFrame (here, the first table in the list) as an HTML file.

> 🎯 **Real-World Use**: HTML export is very handy when building reports.
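
When no file path is given, `to_html()` returns the markup as a string instead of writing a file. A minimal sketch, assuming a small DataFrame built by hand from two rows of the Wikipedia table shown earlier:

```python
import pandas as pd

# Small hand-built DataFrame (values taken from the table shown earlier)
df = pd.DataFrame(
    {"Mobile country code": [412, 276], "Country": ["Afghanistan", "Albania"]}
)

# With no path argument, to_html() returns the HTML as a string.
# index=False leaves out the 0, 1, ... row labels.
html_text = df.to_html(index=False)
print(html_text)
```

The returned string can be embedded in a larger report template, which is often more convenient than a standalone file.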

---

### 🌍 **4. Multiple Tables – Using the `match` Parameter**

```python
data = pd.read_html("https://en.wikipedia.org/wiki/Economy_of_the_United_States", match="Government debt")
data[0]
```

📊 **Government Debt Table (Simplified)**:

| Year | GDP | Inflation | Unemployment | Government Debt (%) |
|------|-----|-----------|--------------|----------------------|
| 2001 | 10581.9 | 2.8% | 4.7% | 53.1% |
| 2008 | 14769.9 | 3.8% | 5.8% | 73.4% |
| 2020 | 20893.8 | 1.2% | 8.1% | 133.9% |

✅ With `match="..."` we can target a specific table: only tables containing text that matches the given string or regex are returned.
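
The filtering effect of `match` can be seen on a small in-memory page. This is a sketch under assumptions: `lxml` (or `beautifulsoup4` with `html5lib`) is installed, and the `page` string with its captions is hypothetical, reusing a few values from the tables in this lesson.

```python
from io import StringIO

import pandas as pd

# Hypothetical page with two captioned tables (illustrative only)
page = """
<table>
  <caption>Government debt</caption>
  <tr><th>Year</th><th>Debt share</th></tr>
  <tr><td>2001</td><td>53.1%</td></tr>
  <tr><td>2008</td><td>73.4%</td></tr>
</table>
<table>
  <caption>Temperatures</caption>
  <tr><th>Location</th><th>July</th></tr>
  <tr><td>Duluth</td><td>76/55</td></tr>
</table>
"""

# Without match, every table on the page is returned
all_tables = pd.read_html(StringIO(page))

# With match, only tables whose text matches the pattern are kept
debt_tables = pd.read_html(StringIO(page), match="Government debt")

print(len(all_tables), len(debt_tables))
print(debt_tables[0])
```

The result is still a list, so `[0]` is needed even when only one table matches. If no table matches, `read_html()` raises a `ValueError`.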

---

### 🧊 **5. Temperature Table Example: Matching with Keywords**

```python
temp_data = pd.read_html("https://en.wikipedia.org/wiki/Minnesota", match="Average daily maximum and minimum temperatures")
temp_data[0]
```

📊 **Temperature Data**:

| Location | July (°F) | Jan (°F) |
|----------|-----------|----------|
| Minneapolis | 83/64 | 23/7 |
| Duluth | 76/55 | 19/1 |

---

### 🧬 **6. XML Basics – What Is XML?**

📌 **XML (eXtensible Markup Language)**:
- Used to store structured data
- Self-descriptive: tags name the data they hold
- Very common in APIs and config files

---

### 📂 **7. `read_xml()` – Converting XML Data into a DataFrame**

```python
from io import StringIO

xml = '''<?xml version='1.0' encoding='utf-8'?>
<data>
 <row>
   <shape>square</shape>
   <degrees>360</degrees>
   <sides>4.0</sides>
 </row>
 <row>
   <shape>circle</shape>
   <degrees>360</degrees>
   <sides/>
 </row>
</data>'''

# Wrap the literal string in StringIO: passing raw XML text directly
# to read_xml() is deprecated in recent pandas versions
df = pd.read_xml(StringIO(xml))
print(df)
```

📌 **Output**:

| shape | degrees | sides |
|-------|---------|-------|
| square | 360 | 4.0 |
| circle | 360 | NaN |

---

### 📤 **8. Attributes XML – Advanced Style**

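```python
from io import StringIO

xml = '''<data>
  <row shape="square" degrees="360" sides="4.0" firstname="Krish"/>
  <row shape="circle" degrees="360"/>
</data>'''

# xpath selects the <row> elements; each attribute becomes a column
df = pd.read_xml(StringIO(xml), xpath=".//row")
print(df)
```

📌 In the output, XML attributes are detected as columns. Attributes missing from a row (here `sides` and `firstname` on the second row) show up as missing values.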
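
The attribute-to-column mapping can be verified directly. A minimal sketch using the same XML; `parser="etree"` is an assumption made here so the example runs on the standard-library parser without `lxml`, while the default parser is `lxml`.

```python
from io import StringIO

import pandas as pd

xml = """<data>
  <row shape="square" degrees="360" sides="4.0" firstname="Krish"/>
  <row shape="circle" degrees="360"/>
</data>"""

# parser="etree" is an assumption: it avoids the lxml dependency
df = pd.read_xml(StringIO(xml), xpath=".//row", parser="etree")

# Every attribute seen on any <row> becomes a column
print(df.columns.tolist())

# True marks cells where the attribute was absent on that row
print(df.isna())
```

Checking `isna()` right after loading makes it easy to spot rows that lack optional attributes.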

---

### 🧭 **9. Namespaced XML – Real World XML Example**

```python
from io import StringIO

xml = '''<doc:data xmlns:doc="https://example.com">
  <doc:row>
    <doc:shape>square</doc:shape>
    <doc:degrees>360</doc:degrees>
    <doc:sides>4.0</doc:sides>
  </doc:row>
</doc:data>'''

# Map the "doc" prefix used in the xpath to its namespace URI
df = pd.read_xml(StringIO(xml), xpath=".//doc:row", namespaces={"doc": "https://example.com"})
df.to_xml('test1.xml')
```

✅ **Namespaces** are needed when XML tags carry a prefix (common in real-world APIs). Note that `to_xml()` as called here writes plain, unprefixed tags to `test1.xml`.
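
To keep the prefix when exporting, `to_xml()` accepts `namespaces` and `prefix` arguments. A sketch of the round trip, assuming the same `doc` namespace; `parser="etree"` is chosen here only so the example runs without `lxml`, and `NS` is just a helper name for illustration.

```python
from io import StringIO

import pandas as pd

# Helper name for illustration: prefix -> namespace URI
NS = {"doc": "https://example.com"}

xml = """<doc:data xmlns:doc="https://example.com">
  <doc:row>
    <doc:shape>square</doc:shape>
    <doc:degrees>360</doc:degrees>
    <doc:sides>4.0</doc:sides>
  </doc:row>
</doc:data>"""

# Reading: column names come back without the "doc:" prefix
df = pd.read_xml(StringIO(xml), xpath=".//doc:row", namespaces=NS, parser="etree")
print(df.columns.tolist())

# Writing: with no path, to_xml() returns a string;
# namespaces + prefix put the "doc:" prefix back on every tag
xml_out = df.to_xml(index=False, namespaces=NS, prefix="doc", parser="etree")
print(xml_out)
```

Passing `index=False` keeps the DataFrame's row labels out of the exported XML, so the output stays close to the input structure.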

---

### 📌 **10. Summary – Key Points 🚀**

| Topic | Function | Use |
|-------|----------|-----|
| HTML read | `pd.read_html()` | Read tables from Wikipedia, reports, and other pages |
| HTML export | `df.to_html()` | Save a DataFrame as an HTML file |
| XML read | `pd.read_xml()` | Load data from an XML string or file |
| XML export | `df.to_xml()` | Export a DataFrame to XML format |

---

### 🔥 Bonus Real-World Use-Cases:

- Scraping GDP, population, or sports stats from Wikipedia
- Pulling transport or weather data from government open-data XML feeds
- Parsing HTML tables to turn them into Excel/CSV files 😎

---