## [Problem 22](https://projecteuler.net/problem=22): Names Scores

Using names.txt (right click and 'Save Link/Target As...'), a 46K text file containing over five-thousand first names, begin by sorting it into alphabetical order. Then working out the alphabetical value for each name, multiply this value by its alphabetical position in the list to obtain a name score.

For example, when the list is sorted into alphabetical order, COLIN, which is worth 3 + 15 + 12 + 9 + 14 = 53, is the 938th name in the list. So, COLIN would obtain a score of 938 × 53 = 49714.
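
The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: the helper name `letter_value` is hypothetical, and the position 938 is taken from the problem statement rather than computed.

```python
def letter_value(letter):
    """Position of an uppercase letter in the alphabet (A=1, ..., Z=26)."""
    return ord(letter) - ord('A') + 1

# Alphabetical value of COLIN: 3 + 15 + 12 + 9 + 14
colin_value = sum(letter_value(c) for c in 'COLIN')

# 938 is COLIN's position in the sorted list, as given in the problem statement
colin_score = 938 * colin_value

print(colin_value, colin_score)  # 53 49714
```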

What is the total of all the name scores in the file?

In [1]:
import pandas as pd

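Read the names in as a pandas Series. The file is a single line of quoted, comma-separated names, so `read_csv` returns a one-row DataFrame, which `squeeze` converts to a Series: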

In [2]:
names = pd.read_csv('names.txt', header=None, sep=',').squeeze("rows")

print(names.info())
print(names.head())
print(names.tail())

<class 'pandas.core.series.Series'>
Int64Index: 5163 entries, 0 to 5162
Series name: 0
Non-Null Count  Dtype 
--------------  ----- 
5162 non-null   object
dtypes: object(1)
memory usage: 80.7+ KB
None
0         MARY
1     PATRICIA
2        LINDA
3      BARBARA
4    ELIZABETH
Name: 0, dtype: object
5158        ELDEN
5159       DORSEY
5160       DARELL
5161    BRODERICK
5162       ALONSO
Name: 0, dtype: object


The Series has 5163 entries but only 5162 non-null values, so one entry was read as null. I will explore this further:

In [3]:
print(names[names.isna()])

3302    NaN
Name: 0, dtype: object


In [4]:
print(names.loc[3301:3303])

3301      OLENE
3302        NaN
3303    MERRILL
Name: 0, dtype: object


Upon inspection, the null value is actually a name: "NA". By default, `read_csv` interprets the string "NA" as a missing value. I will replace the null with the correct name.

In [5]:
names.loc[3302] = "NA"
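
An alternative to patching the value afterwards would be to stop pandas from treating "NA" as missing when the file is read. The following is a minimal sketch, assuming a hypothetical three-name stand-in for `names.txt` (the `sample` buffer and the `sample_names` variable are not part of the notebook); `keep_default_na=False` is a documented `read_csv` parameter.

```python
import io

import pandas as pd

# Hypothetical stand-in for names.txt, in the same quoted, comma-separated layout
sample = io.StringIO('"MARY","NA","LINDA"')

# keep_default_na=False stops pandas from converting strings such as "NA" to NaN
row = pd.read_csv(sample, header=None, keep_default_na=False)

# The frame has a single row; take it as a Series
sample_names = row.iloc[0]

print(sample_names.tolist())
print(sample_names.isna().sum())
```

With this option the name "NA" survives as an ordinary string, so no manual replacement is needed.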

Now I will sort the names in alphabetical order.

In [6]:
names.sort_values(inplace=True, ignore_index=True)

print(names.info())
print(names.head())
print(names.tail())

<class 'pandas.core.series.Series'>
RangeIndex: 5163 entries, 0 to 5162
Series name: 0
Non-Null Count  Dtype 
--------------  ----- 
5163 non-null   object
dtypes: object(1)
memory usage: 40.5+ KB
None
0    AARON
1    ABBEY
2    ABBIE
3     ABBY
4    ABDUL
Name: 0, dtype: object
5158       ZORA
5159    ZORAIDA
5160       ZULA
5161     ZULEMA
5162      ZULMA
Name: 0, dtype: object


Next I will write a function that will calculate the alphabetical score of a given name:

In [7]:
letter_values = {'A': 1,
                 'B': 2,
                 'C': 3,
                 'D': 4,
                 'E': 5,
                 'F': 6,
                 'G': 7,
                 'H': 8,
                 'I': 9,
                 'J': 10,
                 'K': 11,
                 'L': 12,
                 'M': 13,
                 'N': 14,
                 'O': 15,
                 'P': 16,
                 'Q': 17,
                 'R': 18,
                 'S': 19,
                 'T': 20,
                 'U': 21,
                 'V': 22,
                 'W': 23,
                 'X': 24,
                 'Y': 25,
                 'Z': 26}


def alpha_score(name):
    '''Compute the alphabetical score of a name: the sum of its letter values (case-insensitive)'''
    score = 0
    for letter in name:
        score += letter_values[letter.upper()]
    return score

print(alpha_score('JEFF'))
print(alpha_score('COLIN'))
print(alpha_score('stevenson'))

27
53
133
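
The 26-entry dictionary above could also be generated rather than typed out. A minimal sketch using only the standard library; `letter_values_alt` and `alpha_score_alt` are hypothetical names chosen so they do not clash with the notebook's own:

```python
import string

# Map 'A'..'Z' to 1..26 by enumerating the uppercase alphabet
letter_values_alt = {
    letter: position
    for position, letter in enumerate(string.ascii_uppercase, start=1)
}

def alpha_score_alt(name):
    """Sum the letter values of a name, ignoring case."""
    return sum(letter_values_alt[letter] for letter in name.upper())

print(alpha_score_alt('JEFF'))
print(alpha_score_alt('COLIN'))
print(alpha_score_alt('stevenson'))
```

This should print the same three values as the cell above (27, 53, 133).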


Now I will create a dataframe from the names:

In [8]:
names_df = pd.DataFrame(names)
names_df.columns = ['name']
names_df

|      | name    |
|------|---------|
| 0    | AARON   |
| 1    | ABBEY   |
| 2    | ABBIE   |
| 3    | ABBY    |
| 4    | ABDUL   |
| ...  | ...     |
| 5158 | ZORA    |
| 5159 | ZORAIDA |
| 5160 | ZULA    |
| 5161 | ZULEMA  |


Next, compute the alphabetical score of each name:

In [9]:
names_df['alpha_score'] = names_df['name'].apply(alpha_score)
names_df

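|      | name    | alpha_score |
|------|---------|-------------|
| 0    | AARON   | 49          |
| 1    | ABBEY   | 35          |
| 2    | ABBIE   | 19          |
| 3    | ABBY    | 30          |
| 4    | ABDUL   | 40          |
| ...  | ...     | ...         |
| 5158 | ZORA    | 60          |
| 5159 | ZORAIDA | 74          |
| 5160 | ZULA    | 60          |
| 5161 | ZULEMA  | 78          |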


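Next, I can use the index of the dataframe and the alphabetical score to calculate the name score for each name. The index is zero-based, so I add 1 to get each name's position in the list; COLIN, at index 937, is the 938th name, matching the example in the problem statement: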

In [10]:
names_df['name_score'] = (names_df.index + 1) * names_df['alpha_score']
print(names_df[names_df['name'] == 'COLIN'])
names_df

      name  alpha_score  name_score
937  COLIN           53       49714


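|      | name    | alpha_score | name_score |
|------|---------|-------------|------------|
| 0    | AARON   | 49          | 49         |
| 1    | ABBEY   | 35          | 70         |
| 2    | ABBIE   | 19          | 57         |
| 3    | ABBY    | 30          | 120        |
| 4    | ABDUL   | 40          | 200        |
| ...  | ...     | ...         | ...        |
| 5158 | ZORA    | 60          | 309540     |
| 5159 | ZORAIDA | 74          | 381840     |
| 5160 | ZULA    | 60          | 309660     |
| 5161 | ZULEMA  | 78          | 402636     |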


Finally, sum up all the name scores!

In [11]:
names_df['name_score'].sum()

871198282
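
As a cross-check, the whole calculation can also be written without pandas. The sketch below is not the method used in this notebook; the function names `alpha_value` and `total_name_score` are hypothetical, and it runs on a three-name sample rather than the real file.

```python
def alpha_value(name):
    """Sum of letter positions (A=1, ..., Z=26), ignoring case."""
    return sum(ord(c) - ord('A') + 1 for c in name.upper())

def total_name_score(names):
    """Sort the names, then sum position * alphabetical value (positions start at 1)."""
    return sum(
        position * alpha_value(name)
        for position, name in enumerate(sorted(names), start=1)
    )

# Hypothetical sample: sorted order is LINDA (40), MARY (57), PATRICIA (77)
sample_names = ["MARY", "PATRICIA", "LINDA"]
print(total_name_score(sample_names))  # 1*40 + 2*57 + 3*77 = 385
```

Applied to the full list of names (with "NA" kept as a string), this should agree with the total computed above.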