# Archaeological Data Analysis: Individual Mod

### Author:  Sean Sullivan

## Download delimited-text data

We'll make the standard Scala `Source` object available by `import`ing it, then use it to retrieve the content of a URL.

In [11]:
import scala.io.Source
val lyrianCex = "http://shot.holycross.edu/courses/ada/S20/data/lycianNecropoleis.cex"

import scala.io.Source

lyrianCex: String = "http://shot.holycross.edu/courses/ada/S20/data/lycianNecropoleis.cex"

We'll extract a sequence of lines from the URL source, and convert them to our favorite type of Scala collection, a `Vector`.
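As an aside, a `Source` stays open until it is closed. The following is a minimal sketch of reading every line and then closing the source. The helper name `readAllLines` is hypothetical, and `Source.fromString` stands in for the URL only so that the example runs without a network connection.

```scala
import scala.io.Source

// Hypothetical helper: read every line into a Vector, then close the source.
def readAllLines(src: Source): Vector[String] =
  try src.getLines().toVector
  finally src.close()

// Source.fromString stands in for Source.fromURL so no network is needed.
val sample = readAllLines(Source.fromString("sitename#ztype#tombcount\nPinara#IIIA#2"))
```

The same helper should accept the result of `Source.fromURL`, since both return a `Source`.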

(The following cell downloads the data:  depending on your internet connection, this might take a moment.)

In [12]:
val lines = Source.fromURL(lyrianCex).getLines.toVector

lines: Vector[String] = Vector(
  "sitename#ztype#tombcount#comments#ztypetext#rageid",
  "Antiphellos#IID#17#NA#House tomb, rock-cut facade#1667",
  "Pinara#IIIA#2#NA#Lycian sarcophagus, free standing#1696",
  "Delicedere#IIIA#1#NA#Lycian sarcophagus, free standing#NA",
  "Xanthos#IIIA#9#NA#Lycian sarcophagus, free standing#1694",
  "Tlos#IIIA#5#NA#Lycian sarcophagus, free standing#1695",
  "Telmessos#IIIA#1#NA#Lycian sarcophagus, free standing#1571",
  "Trysa#IIIA#4#NA#Lycian sarcophagus, free standing#1666",
  "Tuze#IIIA#3#NA#Lycian sarcophagus, free standing#1756",
  "Cindam#IIIA#1#NA#Lycian sarcophagus, free standing#1755",
  "Bayindir Liman#IIIA#1#NA#Lycian sarcophagus, free standing#1724",
  "Sura#IIIA#1#NA#Lycian sarcophagus, free standing#1702",
  "Limyra#IIIA#5#NA#Lycian sarcophagus, free standing#1701",
...

## Finding the numeric values of the Tombs

We now have a Vector of Strings. After dropping the header line with `tail`, we split each remaining String on the `#` character, so each line of data is mapped to a Vector of Strings. The result is a Vector of Vectors.
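To see the split on a single line, here is a small sketch using one row from the data set. The second half illustrates a caveat of `split` that this data set avoids by writing `NA` for missing values; the string `"a#b##"` is a made-up example, not part of the data.

```scala
// One sample line copied from the data set.
val line = "Pinara#IIIA#2#NA#Lycian sarcophagus, free standing#1696"
val columns = line.split("#").toVector
// columns has six elements, and columns(2) is "2"

// Caveat: split drops trailing empty strings unless given a negative limit.
val dropped = "a#b##".split("#").toVector      // Vector("a", "b")
val kept    = "a#b##".split("#", -1).toVector  // Vector("a", "b", "", "")
```

Note that the fifth column contains a comma, which is one reason a `#` delimiter is convenient here.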

In [13]:
val data = lines.tail.map(ln => ln.split("#").toVector)

data: Vector[Vector[String]] = Vector(
  Vector(
    "Antiphellos",
    "IID",
    "17",
    "NA",
    "House tomb, rock-cut facade",
    "1667"
  ),
  Vector(
    "Pinara",
    "IIIA",
    "2",
    "NA",
    "Lycian sarcophagus, free standing",
    "1696"
  ),
  Vector(
    "Delicedere",
    "IIIA",
    "1",
    "NA",
    "Lycian sarcophagus, free standing",
    "NA"
  ),
  Vector(
    "Xanthos",
    "IIIA",
    "9",
    "NA",
    "Lycian sarcophagus, free standing",
    "1694"
  ),
  Vector("Tlos", "IIIA", "5", "NA", "Lycian sarcophagus, free standing", "1695"),
...

The `tombcount` value is in the third column (index 2) of each line of data. Now we create a new Vector that contains only the tomb counts.

In [14]:
val tombs = data.map(columns => columns(2))

tombs: Vector[String] = Vector(
  "17",
  "2",
  "1",
  "9",
  "5",
  "1",
  "4",
  "3",
  "1",
  "1",
  "1",
  "5",
  "2",
  "1",
  "1",
  "1",
  "1",
  "1",
  "1",
  "1",
  "2",
  "3",
  "1",
  "1",
  "1",
  "1",
  "1",
  "1",
  "1",
  "1",
  "1",
  "1",
  "6",
  "13",
  "2",
  "17",
  "2",
  "8",
...
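Rather than hard-coding the index 2, the column position can be looked up by name in the header line. This is a sketch of that alternative, not the approach used in the cells above; the names `sampleLines`, `header`, `countIndex`, and `sampleTombs` are introduced here only for illustration.

```scala
// Header and two rows copied from the data set.
val sampleLines = Vector(
  "sitename#ztype#tombcount#comments#ztypetext#rageid",
  "Antiphellos#IID#17#NA#House tomb, rock-cut facade#1667",
  "Pinara#IIIA#2#NA#Lycian sarcophagus, free standing#1696"
)

// Split the header, then find the position of the column we want.
val header = sampleLines.head.split("#").toVector
val countIndex = header.indexOf("tombcount")   // indexOf returns -1 if the name is absent

// Use the looked-up index to pull that column from every data row.
val sampleTombs = sampleLines.tail.map(ln => ln.split("#")(countIndex))
```

Looking up the index this way keeps the code working if the columns are ever reordered.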

We need to add up these values to find the total count of tombs, but this is a Vector of Strings. We can convert each String to an integer with its `toInt` method.

In [15]:
val tombCounts = tombs.map(s => s.toInt)

tombCounts: Vector[Int] = Vector(
  17,
  2,
  1,
  9,
  5,
  1,
  4,
  3,
  1,
  1,
  1,
  5,
  2,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  2,
  3,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  6,
  13,
  2,
  17,
  2,
  8,
...
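`toInt` throws a `NumberFormatException` when a String is not a valid integer. Every `tombcount` value converted cleanly, but other columns in this data set use `NA` for missing values. The sketch below shows one way to convert such a column safely, using the `rageid` values of the first four data rows; the names `raw`, `parsed`, and `numbers` are introduced only for this example.

```scala
import scala.util.Try

// rageid values from the first four data rows, including one "NA".
val raw = Vector("1667", "1696", "NA", "1694")

// Try(...).toOption yields Some(n) on success and None when toInt throws.
val parsed: Vector[Option[Int]] = raw.map(s => Try(s.toInt).toOption)

// flatten keeps only the successfully parsed numbers.
val numbers: Vector[Int] = parsed.flatten
```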

Using the `sum` method, we can easily add up the integer values in the Vector.
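As a small worked example, here is `sum` applied to the first five tomb counts, alongside an equivalent explicit fold. The names `firstFive`, `total`, and `folded` are introduced only for this illustration.

```scala
// The first five tomb counts from the data set.
val firstFive = Vector(17, 2, 1, 9, 5)

// sum adds every element: 17 + 2 + 1 + 9 + 5 = 34
val total = firstFive.sum

// The same result written as an explicit fold starting from 0.
val folded = firstFive.foldLeft(0)(_ + _)
```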

In [16]:
tombCounts.sum

res15: Int = 1085