### **Resume Parser**
**Goal:** Build a parser that reads a resume and extracts structured details (contact info, skills, education, and job history) using PDF text extraction, regular expressions, and spaCy named-entity recognition.

**Install Required Libraries**

In [2]:
!pip install spacy pandas pdfplumber
!python -m spacy download en_core_web_sm


**Extract Text from a Resume File (PDF or TXT)**

In [4]:
import pdfplumber

def extract_text_from_pdf(pdf_path):
    text = ''
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            page_text = page.extract_text()
            if page_text:
                text += page_text + '\n'
    return text.strip()

# Example usage
resume_text = extract_text_from_pdf("resume.pdf")
print(resume_text[:500])  # Preview first 500 characters

Amr Khaled Mohamed
+201024282025 Cairo,Egypt
Amrkhaled.gm@gmail.com LinkedIn:
amr-khaleddd
GitHub:Amrok9
⋄
Objective
A highly motivated and dedicated Artificial Intelligence graduate seeking a challenging job or internship
opportunity to apply and expand my expertise in machine learning, natural language processing, computer
vision, and reinforcement learning, Passionate about contributing to academic research and education, Eager to
contribute toinnovativeprojects and collaborate witha dynamict
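
The heading mentions TXT files, but the cell above reads only PDFs. The following is a minimal sketch of a loader that dispatches on the file extension. The helper name `extract_text_from_file` and its `encoding` parameter are hypothetical and not part of the original notebook; the PDF branch simply repeats the `pdfplumber` calls used above.

```python
from pathlib import Path

def extract_text_from_file(path, encoding="utf-8"):
    """Return the text of a .pdf or .txt resume (hypothetical helper)."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".txt":
        # errors="replace" substitutes undecodable bytes instead of raising
        return path.read_text(encoding=encoding, errors="replace").strip()
    if suffix == ".pdf":
        # Imported lazily so that TXT-only use does not require pdfplumber
        import pdfplumber
        with pdfplumber.open(str(path)) as pdf:
            pages = [page.extract_text() or "" for page in pdf.pages]
        return "\n".join(pages).strip()
    raise ValueError(f"Unsupported resume format: {suffix!r}")
```

Calling `extract_text_from_file("resume.pdf")` should behave like `extract_text_from_pdf("resume.pdf")`, apart from how empty pages are joined.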


**Extract Name, Email, and Phone Using Regex + spaCy**

In [5]:
import re
import spacy

# Load English language model
nlp = spacy.load("en_core_web_sm")

def extract_contact_info(text):
    name = None
    # Extract email addresses using regex
    emails = re.findall(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}", text)
    # Extract phone numbers: optional "+", one digit, then 8-15 digits, spaces, or hyphens
    phones = re.findall(r"\+?\d[\d\s\-]{8,15}", text)

    # Use spaCy NER and keep the first PERSON entity as the candidate's name
    doc = nlp(text)
    for ent in doc.ents:
        if ent.label_ == "PERSON":
            name = ent.text
            break

    return {
        "name": name,
        "email": emails[0] if emails else None,
        # The phone pattern allows whitespace, so trim any trailing space or newline
        "phone": phones[0].strip() if phones else None
    }
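
The phone pattern accepts any run of digits, spaces, and hyphens, so it can also match date ranges that appear before the real number. The sketch below reuses the two patterns from the cell above on made-up sample text and adds a digit-count filter. The helper `find_phone` and the `min_digits=10` threshold are assumptions for illustration, not part of the original notebook; shorter national numbers would need a lower threshold.

```python
import re

# Same patterns as in extract_contact_info
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s\-]{8,15}")

def find_phone(text, min_digits=10):
    """Return the first candidate with at least `min_digits` digits, else None."""
    for candidate in PHONE_RE.findall(text):
        digits = re.sub(r"\D", "", candidate)  # drop spaces, hyphens, and "+"
        if len(digits) >= min_digits:
            return candidate.strip()
    return None

# Made-up sample text, not taken from the parsed resume
sample = (
    "Jane Doe\n"
    "2019 - 2023 Example Corp\n"
    "+1 202-555-0143 Springfield\n"
    "jane.doe@example.com"
)

print(repr(PHONE_RE.findall(sample)[0]))  # '2019 - 2023 ' (a date range, not a phone)
print(find_phone(sample))                 # +1 202-555-0143
print(EMAIL_RE.search(sample).group())    # jane.doe@example.com
```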

**Extract Skills**

Match a predefined set of skill keywords against the lowercased resume text.

In [6]:
def extract_skills(text, skill_set=None):
    if skill_set is None:
        skill_set = {
            "python", "java", "sql", "excel", "aws", "tensorflow",
            "pytorch", "c++", "javascript", "html", "css", "power bi",
            "docker", "linux", "git", "machine learning", "deep learning",
            "data analysis", "nlp", "flask", "django"
        }

    found_skills = []
    text_lower = text.lower()
    for skill in skill_set:
        if skill in text_lower:
            found_skills.append(skill)

    return list(set(found_skills))
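
Because `skill in text_lower` is a substring test, short keywords can match inside longer words: "java" inside "javascript", "git" inside "digital" or "github", "excel" inside "excellent". The following sketch shows one possible whole-term variant. The function name `extract_skills_exact` is hypothetical, and the sample sentence is made up.

```python
import re

def extract_skills_exact(text, skill_set):
    """Match each skill as a whole term rather than as a substring (sketch)."""
    text_lower = text.lower()
    found = []
    for skill in sorted(skill_set):
        # Lookarounds instead of \b so that terms ending in symbols ("c++") still match
        pattern = r"(?<!\w)" + re.escape(skill.lower()) + r"(?!\w)"
        if re.search(pattern, text_lower):
            found.append(skill)
    return found

sample = "Excellent JavaScript and C++ developer; digital tools hosted on GitHub."
skills = {"excel", "java", "javascript", "c++", "git"}
print(extract_skills_exact(sample, skills))  # ['c++', 'javascript']
```

With the substring test from the cell above, all five keywords would be reported for this sample.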

**Extract Education and Experience Sections**

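Use keyword-based extraction: capture the lines that follow the first line containing the keyword, and stop at the next blank line. The keyword can match anywhere in a line, not only in a heading, and if the extracted text has no blank lines between sections, the capture runs past the intended section.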

In [7]:
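def extract_section(text, keyword):
    # Work on lowercased lines so the keyword match is case-insensitive
    lines = text.lower().split('\n')
    section = []
    capture = False
    for line in lines:
        if keyword in line:
            # Start capturing after any line that contains the keyword
            capture = True
        elif capture and line.strip() == '':
            # Stop at the first blank line after capture has started
            break
        elif capture:
            section.append(line.strip())
    return ' '.join(section)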
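
A body sentence that merely mentions "education" triggers the capture in `extract_section`, and without blank lines nothing stops it. The sketch below is one possible heading-aware variant: it treats only short lines as headings and stops at the next known heading. The function name, the `SECTION_HEADINGS` list, and the four-word limit are all assumptions chosen for illustration; unlike the original, it also preserves the text's case.

```python
# Assumed list of heading keywords; adjust to the resumes being parsed
SECTION_HEADINGS = ("objective", "education", "experience", "skills",
                    "projects", "certificates")

def extract_section_by_heading(text, keyword, headings=SECTION_HEADINGS,
                               max_heading_words=4):
    """Capture lines between a short heading line and the next heading (sketch)."""
    def is_heading(line, word):
        stripped = line.strip().lower()
        # Only short lines count as headings, so body sentences are ignored
        return word in stripped and len(stripped.split()) <= max_heading_words

    section = []
    capture = False
    for line in text.split("\n"):
        if not capture:
            capture = is_heading(line, keyword)
            continue
        if any(is_heading(line, h) for h in headings if h != keyword):
            break  # the next section starts here
        if line.strip():
            section.append(line.strip())
    return " ".join(section)

# Made-up sample text with no blank lines between sections
sample = "\n".join([
    "Objective",
    "Passionate about research and education, eager to learn new things.",
    "Professional Experience",
    "Instructor, Example Initiative",
    "Education",
    "BSc in Artificial Intelligence, Example University",
    "Skills",
    "Python, SQL",
])
print(extract_section_by_heading(sample, "education"))
# BSc in Artificial Intelligence, Example University
print(extract_section_by_heading(sample, "experience"))
# Instructor, Example Initiative
```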

**Combine All Parsers**

In [9]:
def parse_resume(text):
    contact_info = extract_contact_info(text)
    skills = extract_skills(text)
    education = extract_section(text, "education")
    experience = extract_section(text, "experience")
    return {
        **contact_info,
        "skills": skills,
        "education": education,
        "experience": experience
    }

# Parse the resume
parsed_data = parse_resume(resume_text)

# Print extracted data
for key, value in parsed_data.items():
    print(f"{key.upper()}: {value}\n")

NAME: Amr Khaled Mohamed

EMAIL: Amrkhaled.gm@gmail.com

PHONE: +201024282025

SKILLS: ['pytorch', 'data analysis', 'git', 'machine learning', 'c++', 'aws', 'html', 'css', 'nlp', 'excel', 'tensorflow', 'deep learning', 'java', 'python']

EDUCATION: contribute toinnovativeprojects and collaborate witha dynamicteam todriveadvancements inai technologies. professional experience instructor,digital egypt cubsinitiative july-aug, 2024  guided students in mastering key topics such as problem-solving fundamentals, creative thinking, and the software development life cycle, significantly enhancing their skills and competencies. bachelors of artificial intelligence, egyptian russian university graduation date 2024 c-gpa: 3.23 training and professional certificates alx ai career essential may, 2024 awsacademy graduate -aws academy machinelearningfoundations feb, 2024 awsacademy graduate -aws academy cloudfoundation feb, 2024 advancedlearning algorithms (coursera) dec,2023 artificial intelligenc

### 📝 **Summary**

This project extracts structured information from resume PDFs using NLP and regex.

**Main Features Extracted:**
- 👤 Name (using spaCy's NER model)
- 📧 Email (regex)
- 📱 Phone number (regex)
- 🧠 Skills (matched from a predefined skill set)
- 🎓 Education section (keyword-based)
- 💼 Experience section (keyword-based)

**Libraries Used:**
- `pdfplumber`: To extract text from PDF files
- `spaCy`: For named-entity recognition (PERSON entities)
- `re`: For pattern matching (email, phone)
- `set` (Python built-in): To hold the known skills for matching

**Steps:**
1. Read text from resume PDF
2. Extract contact info (name, email, phone)
3. Match known skills in the text
4. Extract sections like Education & Experience using keyword heuristics

✅ This project demonstrates how to build a basic resume parser using rule-based and NLP techniques. The skill matcher relies on substring tests and the section extractor relies on blank lines, so both need refinement before real-world use.
