<p>This Jupyter notebook contains the routines necessary to build a URL-shortening program that operates on a pandas DataFrame using the pyshorteners package. pyshorteners can use a variety of URL shortener APIs; my function lets the user choose either TinyURL or Bitly.</p>

<p>I decided to make this program because, as part of my job search process, I saved information about the positions I was interested in applying to in an Excel spreadsheet. However, the links to the jobs (from LinkedIn searches) were too long to fit tidily within a one-line cell. Thus, I wrote this program to make it easier for me to sift through the job postings later on.</p>

<p>This program requires the user to input the following:</p>
<p>--A .csv file with at least one column containing URLs</p>

<p>This program will output the following:</p>
<p>--A .csv file containing a new column with the shortened URLs</p>
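<p>A minimal sketch of a pre-flight check on the input file, assuming a hypothetical helper named has_url_column that is not part of the original notebook. It uses only the standard-library csv module and an in-memory sample, so it runs without the real spreadsheet; the sample row is made up for illustration.</p>

```python
import csv
import io

def has_url_column(csv_text, column):
    """Return True if `column` is a header and at least one row holds an http(s) URL."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # fieldnames is None for an empty file
    if reader.fieldnames is None or column not in reader.fieldnames:
        return False
    # Short rows yield None for missing cells, so fall back to an empty string
    return any((row[column] or "").startswith(("http://", "https://")) for row in reader)

# Made-up sample data, not taken from the real spreadsheet
sample = "Company,Link to Job\nExampleCo,https://example.com/jobs/1\n"
print(has_url_column(sample, "Link to Job"))
print(has_url_column(sample, "URL"))
```

<p>Running a check like this before shortening would surface a misspelled column name early, rather than as a KeyError partway through the notebook.</p>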

<p>This notebook will also show some data visualization based on my collected info for these jobs.</p>

In [2]:
#Import the required packages
import pyshorteners
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

In [14]:
#Step 1: Import the .csv file
df = pd.read_csv('job data.csv')
df.head()

Unnamed: 0,Job Description,Company,Link to Job,Location,Remote,Company Size,Type,Posted On,Expected Salary
0,"Data Scientist, Spotify For Artists",Spotify,https://www.linkedin.com/jobs/view/3189580391/...,United States,YES,10000,Full-time,8/8/2022,
1,"Data Scientist, Amazon Pharmacy",Amazon,https://www.linkedin.com/jobs/view/3086068882/...,United States,YES,10000,Full-time,7/25/2022,206000.0
2,Data Scientist,Nielsen,https://www.linkedin.com/jobs/view/3198666818/...,United States,YES,10000,Full-time,8/5/2022,
3,Data Scientist - Provider Interaction,CVS Health,https://www.linkedin.com/jobs/view/3196193466/...,"New York, NY",YES,10000,Full-time,8/3/2022,128700.0
4,Data Scientist - Remote,Bayer,https://www.linkedin.com/jobs/view/3200875588/...,"Creve Coeur, MO",YES,10000,Full-time,8/2/2022,


In [15]:
#Step 2: Make the url shortener function
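import os
#Read the Bitly API key from an environment variable instead of hard-coding it in the notebook.
#The variable name BITLY_API_KEY is an assumption; use whatever name you export.
keyToAPI = os.environ.get('BITLY_API_KEY', '')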
def urlShorten(long_url, whichAPI = 'TinyURL', keyToAPI = ""):
    """
    Args:
        long_url:   (str) Long URL that is to be shortened
        whichAPI:   (str) [Optional] Which API to use. Must be either 'TinyURL' or 'Bitly'. TinyURL is the default
        keyToAPI:   (str) [Optional] API key. Only needed when using Bitly
    Returns:
        short_url:  (str) Shortened URL
    Raises:
        ValueError: if whichAPI is not one of the supported APIs
    """
    listOfAPIs = ['TinyURL', 'Bitly']

    #Fail early with a clear message instead of hitting an undefined short_url below
    if whichAPI not in listOfAPIs:
        raise ValueError(f"whichAPI must be one of {listOfAPIs}, got {whichAPI!r}")

    if whichAPI == 'Bitly':
        type_bitly = pyshorteners.Shortener(api_key=keyToAPI)
        short_url  = type_bitly.bitly.short(long_url)
    else:
        type_tiny = pyshorteners.Shortener()
        short_url = type_tiny.tinyurl.short(long_url)

    return short_url
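
<p>The function takes the API choice as optional arguments, but a column-wise apply hands it only the URL. One possible way to bind the extra arguments is functools.partial. The sketch below uses a stand-in named urlShorten_stub (hypothetical, with the same signature as urlShorten) and a placeholder key so that it runs offline without calling any shortening service.</p>

```python
from functools import partial

# Stand-in with the same signature as urlShorten so the sketch runs offline;
# it only tags the URL with the chosen service and makes no network request
def urlShorten_stub(long_url, whichAPI='TinyURL', keyToAPI=""):
    return whichAPI.lower() + ":" + long_url

# Bind the optional arguments once, leaving a one-argument function
# ('YOUR_KEY' is a placeholder, not a real API key)
shorten_with_bitly = partial(urlShorten_stub, whichAPI='Bitly', keyToAPI='YOUR_KEY')

links = ["https://example.com/jobs/1", "https://example.com/jobs/2"]
short_links = [shorten_with_bitly(u) for u in links]
print(short_links)
```

<p>With the real function, passing partial(urlShorten, whichAPI='Bitly', keyToAPI=keyToAPI) to the column's apply call should select Bitly in the same way.</p>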

In [17]:
#Step 3: Apply the URL shortening function to the link column
colWithLongURL = 'Link to Job' #Change this to the column name containing the long URL you want shortened!
df['Short Link'] = df[colWithLongURL].apply(urlShorten)
df.head()

Unnamed: 0,Job Description,Company,Link to Job,Location,Remote,Company Size,Type,Posted On,Expected Salary,Short Link
0,"Data Scientist, Spotify For Artists",Spotify,https://www.linkedin.com/jobs/view/3189580391/...,United States,YES,10000,Full-time,8/8/2022,,https://tinyurl.com/2le59b7b
1,"Data Scientist, Amazon Pharmacy",Amazon,https://www.linkedin.com/jobs/view/3086068882/...,United States,YES,10000,Full-time,7/25/2022,206000.0,https://tinyurl.com/2gs8hsnm
2,Data Scientist,Nielsen,https://www.linkedin.com/jobs/view/3198666818/...,United States,YES,10000,Full-time,8/5/2022,,https://tinyurl.com/2j4ndjwx
3,Data Scientist - Provider Interaction,CVS Health,https://www.linkedin.com/jobs/view/3196193466/...,"New York, NY",YES,10000,Full-time,8/3/2022,128700.0,https://tinyurl.com/2n9tes25
4,Data Scientist - Remote,Bayer,https://www.linkedin.com/jobs/view/3200875588/...,"Creve Coeur, MO",YES,10000,Full-time,8/2/2022,,https://tinyurl.com/2ezqm4fb
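
<p>The apply call above assumes every cell holds a valid URL and every API request succeeds. A minimal sketch of a more forgiving wrapper follows, assuming a hypothetical helper named shorten_or_keep; the shortening function is passed in as shorten_fn so the example can run offline with stand-ins. Catching every exception is a deliberate simplification here, not a recommendation.</p>

```python
def shorten_or_keep(long_url, shorten_fn):
    """Return the shortened URL, None for a missing cell, or the original URL if shortening fails."""
    # pandas represents empty cells as float NaN, the only value not equal to itself
    if long_url is None or long_url != long_url:
        return None
    try:
        return shorten_fn(long_url)
    except Exception:
        # Keep the long link rather than losing the row's URL entirely
        return long_url

# Stand-ins for a working and a failing shortener (not real API calls)
def ok_shortener(url):
    return "https://short.example/abc"

def failing_shortener(url):
    raise RuntimeError("service unavailable")

print(shorten_or_keep("https://example.com/jobs/1", ok_shortener))
print(shorten_or_keep(float("nan"), ok_shortener))
print(shorten_or_keep("https://example.com/jobs/1", failing_shortener))
```

<p>In the notebook this could be applied as df[colWithLongURL].apply(lambda u: shorten_or_keep(u, urlShorten)), so that one bad row does not abort the whole column.</p>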


In [22]:
#Step 4: Drop the long URL column before writing the .csv file. With errors='ignore',
#the drop is skipped silently if the column does not exist in the dataframe
df = df.drop(colWithLongURL, axis=1, errors='ignore')
df.head()

Unnamed: 0,Job Description,Company,Location,Remote,Company Size,Type,Posted On,Expected Salary,Short Link
0,"Data Scientist, Spotify For Artists",Spotify,United States,YES,10000,Full-time,8/8/2022,,https://tinyurl.com/2le59b7b
1,"Data Scientist, Amazon Pharmacy",Amazon,United States,YES,10000,Full-time,7/25/2022,206000.0,https://tinyurl.com/2gs8hsnm
2,Data Scientist,Nielsen,United States,YES,10000,Full-time,8/5/2022,,https://tinyurl.com/2j4ndjwx
3,Data Scientist - Provider Interaction,CVS Health,"New York, NY",YES,10000,Full-time,8/3/2022,128700.0,https://tinyurl.com/2n9tes25
4,Data Scientist - Remote,Bayer,"Creve Coeur, MO",YES,10000,Full-time,8/2/2022,,https://tinyurl.com/2ezqm4fb


In [21]:
#Step 5: Output the .csv file without the index
df.to_csv('out.csv', index=False)