# Ancient Lycia Necropoleis Type Analysis
## Authors: Michael Dahlquist, Kristen Qako, & Sean Sullivan

In ancient Lycia, rock-cut tombs often clustered together in cemetery sites, or necropoleis, such as ancient Myra (modern Demre). This data set describes Lycian necropoleis, including the number of tombs at each site, so it can be used to work out how many tombs in total are represented. The records describe for each site its name, a typological classification by the Danish scholar Jan Zahle, the number of tombs, an English-language summary, and an identifier in a geographic data set.

## Purpose of Notebook:
The purpose of this Jupyter notebook is to discover how many tombs of each ztype are recorded in the tomb data. We then exported the results to an Excel file and created a chart to display them.

### The Data Set

The dataset is available as a delimited-text file [here](https://raw.githubusercontent.com/michaeldahlquist/clas299/master/ancient-lycia-tombs/lycianNecropoleis.cex). The format is one record per row, and columns are delimited by the pound sign (hash tag) `#`. The file includes a header row:

`sitename#ztype#tombcount#comments#ztypetext#rageid`


### Retrieve the data set

To download the data set, we can use Scala's `Source` object, which must first be imported:

`import scala.io.Source`

In [None]:
import scala.io.Source
val lycianNecropoleis = "https://raw.githubusercontent.com/michaeldahlquist/clas299/master/ancient-lycia-tombs/lycianNecropoleis.cex"

We then extract a sequence of lines from the URL source, and convert them to a vector.

In [None]:
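// getLines() is declared with parentheses; calling it that way avoids warnings in newer Scala versions
val lines = Source.fromURL(lycianNecropoleis).getLines().toVector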
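The cell above leaves the `Source` open after reading. A minimal sketch of one way to close it automatically follows, assuming Scala 2.13 or later (for `scala.util.Using`). The helper name `readLines` and the in-memory demo text are illustrative and not part of the original notebook.

```scala
import scala.io.Source
import scala.util.Using

// Hypothetical helper: read every line, then close the source even if reading fails.
def readLines(source: => Source): Vector[String] =
  Using.resource(source)(src => src.getLines().toVector)

// Offline demonstration on an in-memory source with invented content.
val demoLines = readLines(Source.fromString("sitename#ztype#tombcount\nSiteA#X#3"))

// For the real data set one could write:
//   val lines = readLines(Source.fromURL(lycianNecropoleis))
```

`Using.resource` works here because `Source` implements `java.io.Closeable`.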

### Extract the numeric count of tombs

You should now have a Vector of Strings. You want to split up each String on the `#` character, to create a new Vector – this time, a Vector of Vectors. You’ll be mapping each line of the source data to a Vector of strings, one per column.

We also take the `tail` of the vector, because the header row is not part of the data.

In [None]:
val data = lines.tail.map(ln => ln.split("#").toVector)
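To make the splitting step concrete, here is a small sketch on invented rows (the site names and values are placeholders, not taken from the data set). It also shows one behavior worth knowing: `split("#")` silently drops trailing empty fields, so rows can end up with fewer columns than the header unless a limit of `-1` is passed.

```scala
// Invented rows for illustration only; they are not taken from the data set.
val sampleHeader = "sitename#ztype#tombcount#comments#ztypetext#rageid"
val fullRow      = "SiteA#X#3#some comment#some type text#id1"
val shortRow     = "SiteB#Y#2###" // last three fields left empty

val headerCols  = sampleHeader.split("#").toVector  // 6 column names
val fullCols    = fullRow.split("#").toVector       // 6 fields
val droppedCols = shortRow.split("#").toVector      // trailing empty fields dropped: 3 fields
val keptCols    = shortRow.split("#", -1).toVector  // limit -1 keeps them: 6 fields

// Pair column names with values to inspect a single record.
val labelled = headerCols.zip(fullCols).toMap
```

Whether the real file contains rows with empty trailing fields is not stated here, so treat the `-1` variant as an option rather than a required fix.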

In [None]:
val tombCounts = data.map(columns => columns(2).toInt)
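The cell above throws an exception if any row lacks a third column or holds non-numeric text there. The following is a hedged sketch of a more defensive variant, which also sums the counts to get the total number of tombs. It assumes Scala 2.13 or later (for `toIntOption`), and the rows are invented for illustration.

```scala
// Invented rows for illustration; the real `data` comes from the cells above.
val sampleData = Vector(
  Vector("SiteA", "X", "3"),
  Vector("SiteB", "Y", "not a number"),
  Vector("SiteC"),                      // too short: no count column
  Vector("SiteD", "X", "12")
)

// lift(2) gives None when the column is missing; toIntOption gives None
// when the text is not an integer. flatMap keeps only the successful parses.
val safeCounts = sampleData.flatMap(cols => cols.lift(2).flatMap(_.trim.toIntOption))

// Total number of tombs across all rows that parsed cleanly.
val totalTombs = safeCounts.sum
```

Applied to the notebook's own values, the same idea would read `data.flatMap(cols => cols.lift(2).flatMap(_.trim.toIntOption)).sum`. Rows that fail to parse are skipped silently, which may or may not be what an analysis wants.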

### Create a new Class

Here we define a case class that holds one record per site. Named fields make the data easier to organize, filter, and group later in the notebook.

In [None]:
case class LycianTomb (
  site: String,
  ztype: String,
  count: Int,
  comments: String,
  typeDescription: String
)

Here we keep only the rows that have at least five columns and map each one to an instance of the case class defined in the previous cell.

In [None]:
val tombs = data.filter(v => v.size >= 5).map ( cols => LycianTomb(
  cols(0),cols(1),cols(2).toInt, cols(3), cols(4)
))
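The size filter guards against short rows, but `toInt` can still fail on a non-numeric count. As a sketch of an alternative, a small conversion function can return `None` for any row that does not fit. The helper name `toTomb` and the sample rows are hypothetical, and `toIntOption` assumes Scala 2.13 or later.

```scala
// Repeated from the cell above so this sketch runs on its own.
case class LycianTomb(site: String, ztype: String, count: Int, comments: String, typeDescription: String)

// Hypothetical helper: Some(tomb) for a well-formed row, None otherwise.
def toTomb(cols: Vector[String]): Option[LycianTomb] = cols match {
  // At least five columns; any further columns (such as the identifier) are ignored.
  case Vector(site, ztype, count, comments, desc, _*) =>
    count.trim.toIntOption.map(n => LycianTomb(site, ztype, n, comments, desc))
  case _ => None
}

// Invented rows for illustration only.
val sampleRows = Vector(
  Vector("SiteA", "X", "3", "some comment", "some type text", "id1"),
  Vector("SiteB", "Y", "n/a", "some comment", "some type text", "id2"),
  Vector("SiteC", "X")
)

// Only the first row survives: the second has a bad count, the third is too short.
val parsedTombs = sampleRows.flatMap(row => toTomb(row))
```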

### Group by zType:

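Since all of the data is in a vector, we can group it with the `groupBy` method. We group on each tomb's `typeDescription`, which yields a map from each type description to the vector of records with that type.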

In [None]:
val byType = tombs.groupBy( t => t.typeDescription)

This line of code takes the map's keys, giving the set of unique type descriptions found in the data.

In [None]:
val keys = byType.keySet

Finally, this line of code sums the tomb counts for each type, producing a set of pairs of type description and total number of tombs.

In [None]:
val result = for (key <- keys) yield {
  key -> byType(key).map(tomb => tomb.count).sum
}
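The per-type totals can also be computed in one step, sorted, and turned into CSV text that a spreadsheet program such as Excel can open. The sketch below assumes Scala 2.13 or later (for `groupMapReduce`). The counts and the placeholder type name are invented, and the notebook does not show how its own export was done, so this is only one possible route.

```scala
// Invented (description, count) pairs; only the first type name appears in this notebook.
val samplePairs = Vector(
  ("House tomb, rock-cut facade", 5),
  ("Placeholder type B", 2),
  ("House tomb, rock-cut facade", 4),
  ("Placeholder type B", 1)
)

// Group by description, keep only the count, and add the counts within each group.
val totals: Map[String, Int] = samplePairs.groupMapReduce(_._1)(_._2)(_ + _)

// Largest total first.
val ranked = totals.toVector.sortBy { case (_, n) => -n }

// Quote the description because it may contain commas; double any embedded quotes.
def csvLine(desc: String, n: Int): String =
  "\"" + desc.replace("\"", "\"\"") + "\"," + n

// Header row followed by one line per type.
val csvLines = "typeDescription,tombs" +: ranked.map { case (d, n) => csvLine(d, n) }
```

With the notebook's own values, the grouping step would read `tombs.groupMapReduce(_.typeDescription)(_.count)(_ + _)`. The quoting matters because a description like "House tomb, rock-cut facade" contains a comma and would otherwise be split across two spreadsheet columns.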

### View data via an Excel file:

![Graph](https://github.com/michaeldahlquist/clas299/raw/master/lycia-tombs-final-project/graph.png)

## Further research that can be done

You can see the results of our analysis above. Anyone who wants to continue working with this data set could ask: why are the majority of the tombs of the type "House tomb, rock-cut facade"? This question can lead to further investigation of the data.