<a href="https://colab.research.google.com/github/MarMarhoun/freelance_work/blob/main/side_projects/NLP_projs/eda_streamlit/resume_parser.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Project Description:

To deploy a resume parser model as a web app with Streamlit, you need the following:

- A trained resume parser model (e.g., one based on Named Entity Recognition or Information Extraction techniques)
- The Streamlit library installed in your Python environment, along with pandas and transformers, which the example imports

Here is sample Streamlit code for deploying a resume parser model:


In [None]:
# Import necessary libraries
import streamlit as st
import pandas as pd
from transformers import pipeline

# Initialize the model
model = pipeline('feature-extraction', model='cardiffnlp/twitter-roberta-base-sentence-embedding')

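def read_text(uploaded_file):
    # PDFs are binary, so extract their text instead of decoding the raw bytes.
    # Assumption: pypdf is installed (pip install pypdf).
    if uploaded_file.name.lower().endswith('.pdf'):
        from pypdf import PdfReader
        pages = PdfReader(uploaded_file).pages
        return '\n'.join(page.extract_text() or '' for page in pages)
    return uploaded_file.read().decode('utf-8', errors='replace')

def parse_resume(uploaded_file):
    # Load the resume text and split it into non-empty sentences
    text = read_text(uploaded_file)
    sentences = [s.strip() for s in text.split('.') if s.strip()]
    if not sentences:
        return pd.DataFrame()

    # Each pipeline output is shaped [1, num_tokens, hidden_size];
    # average over the tokens to get one embedding per sentence
    rows = [pd.DataFrame(output[0]).mean().tolist() for output in model(sentences)]

    # Create a DataFrame with one row per sentence and one column per dimension
    df = pd.DataFrame(rows)
    df.columns = ['feature_' + str(i) for i in range(df.shape[1])]
    df.insert(0, 'sentence', sentences)

    return df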

# Title of the app
st.title('Resume Parser Web App')

# Upload a file
uploaded_file = st.file_uploader("Upload a Resume (.txt or .pdf)", type=["txt", "pdf"])

if uploaded_file is not None:
    # Parse the resume
    parsed_resume = parse_resume(uploaded_file)

    # Display the parsed resume
    st.subheader('Parsed Resume Features:')
    st.write(parsed_resume)

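This code creates a Streamlit app that lets users upload a resume file (.txt or .pdf), splits the text into sentences, and passes them to a pre-trained feature-extraction model. The resulting embedding values are displayed as a table. Note that feature extraction produces numeric embeddings rather than named fields such as a name or a list of skills; extracting those requires a model trained for Named Entity Recognition or Information Extraction.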
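
To make the table step concrete, here is a minimal sketch of how the nested output of a feature-extraction pipeline can be reduced to one vector per sentence. It assumes the output for each sentence is a nested list shaped [1, num_tokens, hidden_size]. The helper mean_pool() and the toy values are illustrative and not part of the original app.

```python
def mean_pool(sentence_output):
    """Average token vectors into one sentence vector.

    sentence_output is assumed to be shaped [1, num_tokens, hidden_size].
    """
    token_vectors = sentence_output[0]  # drop the leading batch dimension
    num_tokens = len(token_vectors)
    # zip(*...) regroups the values by embedding dimension across tokens
    return [sum(dim) / num_tokens for dim in zip(*token_vectors)]


# Toy stand-in for model output: 1 sentence, 2 tokens, 3 dimensions
fake_output = [[[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]]
sentence_vector = mean_pool(fake_output)
print(sentence_vector)  # [2.0, 3.0, 4.0]
```

Averaging over tokens is only one pooling choice; it keeps the table at one row per sentence with a fixed number of columns regardless of sentence length.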

Please note that this is just a basic example. You can extend the app with more functionality, such as saving the parsed results, building a richer user interface, or integrating with other services.
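
As one possible way to save the parsed results, the sketch below serializes table rows to CSV bytes using only the standard library. The helper name records_to_csv_bytes() and the example rows are hypothetical; in the app, the rows could come from parsed_resume.to_dict(orient="records").

```python
import csv
import io


def records_to_csv_bytes(records):
    """Serialize a list of dicts (one per table row) to UTF-8 CSV bytes."""
    if not records:
        return b""
    buffer = io.StringIO()
    # Use the keys of the first row as the CSV header
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue().encode("utf-8")


# Illustrative rows, not real model output
example_rows = [
    {"sentence": "Python developer", "feature_0": 0.12},
    {"sentence": "Five years of experience", "feature_0": -0.4},
]
csv_bytes = records_to_csv_bytes(example_rows)
print(csv_bytes.decode("utf-8"))
```

In a Streamlit app, these bytes could then be offered to the user with st.download_button("Download CSV", data=csv_bytes, file_name="parsed_resume.csv", mime="text/csv").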

The following code adds enhancements to the previous example:

The function load_model() initializes the feature extraction model.

The function load_encoder() initializes a LabelEncoder object to encode the class labels.

The function parse_resume() uses the feature extraction model and the LabelEncoder to parse the resume.

The block guarded by if uploaded_file is not None: calls parse_resume() and displays the parsed resume both as a table and as a JSON object.

Splitting the code into separate functions improves its readability and maintainability.

In [None]:
import streamlit as st
import pandas as pd
from transformers import pipeline
from sklearn.preprocessing import LabelEncoder

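@st.cache_resource  # load the model once instead of on every Streamlit rerun
def load_model():
    model = pipeline('feature-extraction', model='cardiffnlp/twitter-roberta-base-sentence-embedding')
    return model

def load_encoder():
    # fit() stores the classes as a sorted array, which transform() relies on
    encoder = LabelEncoder()
    encoder.fit(['Paragraph', 'Section', 'Bullet Point', 'Table'])
    return encoder

def read_text(uploaded_file):
    # PDFs are binary, so extract their text instead of decoding the raw bytes.
    # Assumption: pypdf is installed (pip install pypdf).
    if uploaded_file.name.lower().endswith('.pdf'):
        from pypdf import PdfReader
        pages = PdfReader(uploaded_file).pages
        return '\n'.join(page.extract_text() or '' for page in pages)
    return uploaded_file.read().decode('utf-8', errors='replace')

def parse_resume(uploaded_file, model, encoder):
    # Load the resume text and split it into non-empty sentences
    text = read_text(uploaded_file)
    sentences = [s.strip() for s in text.split('.') if s.strip()]
    if not sentences:
        return pd.DataFrame()

    # Each pipeline output is shaped [1, num_tokens, hidden_size];
    # average over the tokens to get one embedding per sentence
    rows = [pd.DataFrame(output[0]).mean().tolist() for output in model(sentences)]

    # Create a DataFrame with one row per sentence and one column per dimension
    df = pd.DataFrame(rows)
    df.columns = ['feature_' + str(i) for i in range(df.shape[1])]
    df.insert(0, 'sentence', sentences)

    # Encode the class labels. Assumption: this example defines no classifier,
    # so every sentence gets the placeholder label 'Paragraph'; replace this
    # list with real predictions.
    df['label'] = encoder.transform(['Paragraph'] * len(sentences))

    return df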

st.title('Resume Parser Web App')
uploaded_file = st.file_uploader("Upload a Resume (.txt or .pdf)", type=["txt", "pdf"])

if uploaded_file is not None:
    model = load_model()
    encoder = load_encoder()

    parsed_resume = parse_resume(uploaded_file, model, encoder)

    st.subheader('Parsed Resume Features:')
    st.write(parsed_resume)

    st.subheader('Parsed Resume as JSON:')
    # st.json renders the JSON string as a structured, collapsible view
    st.json(parsed_resume.to_json(orient='records'))
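
The label column that parse_resume() encodes has to come from somewhere, and the example does not define a classifier for it. As a hypothetical stand-in, the sketch below assigns one of the four classes to each line of a resume using simple text rules. The function name guess_label() and the rules themselves are assumptions made for illustration; a trained model would normally replace them.

```python
import re

# A leading dash, asterisk, bullet character, or "1." / "1)" marks a list item
BULLET_PATTERN = re.compile(r"^\s*([-*•]|\d+[.)])\s+")


def guess_label(line):
    """Heuristically tag a line as one of the four resume classes."""
    stripped = line.strip()
    if BULLET_PATTERN.match(line):
        return "Bullet Point"
    # Pipe or tab separators suggest a row of tabular data
    if "|" in stripped or "\t" in stripped:
        return "Table"
    # Short lines that are all caps or end with a colon look like headings
    if len(stripped.split()) <= 4 and (stripped.isupper() or stripped.endswith(":")):
        return "Section"
    return "Paragraph"


sample_lines = [
    "EXPERIENCE",
    "- Built a Streamlit dashboard",
    "Python | 5 years | Advanced",
    "I am a data scientist who enjoys NLP projects.",
]
labels = [guess_label(line) for line in sample_lines]
print(labels)  # ['Section', 'Bullet Point', 'Table', 'Paragraph']
```

Because the returned strings match the class names used in load_encoder(), a list like labels could be passed to the encoder's transform() method to obtain integer codes.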