Fetching latest commit…
Cannot retrieve the latest commit at this time
|Failed to load latest commit information.|
This is a program for connecting to the CIA World Factbook in order to answer a series of questions my COMS 1007 instructor asked us to answer. The questions are: 1. List countries in South America that are prone to earthquakes. 2. Find the country with the lowest elevation point in Europe. 3. List all countries in the southeastern hemisphere. 4. List countries in Asia with more than 10 political parties. 5. Find all countries that have the color blue in their flag. 6. Find the top 5 countries with the highest electricity consumption per capita. (Electricity consumption % population) 7. A landlocked country is one that is entirely enclosed by land. For example, Austria is landlocked and shares its borders with Germany, Czech Republic, Hungary, etc. There are certain countries that are entirely landlocked by a single country. Find these countries. 8. I want to go on a vacation with a friend. Our goal is to visit as many capital cities as we can in as short a geographical distance as possible. To make things easier (and not worry about spherical geometry), we are fine with travelling to capitals that are within 10 degrees of latitude and longitude of each other. Find the lat/long coordinates and the list of countries/capitals so that the number of capitals is maximized. 9. Wild card – List countries who grant their citizens universal suffrage at age 20. 10. Wild card – List countries with unemployment rates below 5%. ---------------------------------------------------------------------------- What I first did was get the URLs and country names for every entry in the World Factbook. Then I made a list of blacklisted territories/dependencies to remove from the list of countries. When I have an arraylist of just countries that are legitimate countries, I opened up http requests to parse the entirety of their html pages into a string, which is then passed into a new Text object to hold the string. I can then access the html from Text objects instead of opening up http requests again, which increases the speed of the program. Question 1: I used regexes to find the continent name and parsed out the paragraph that describes the natural hazhard. If the continent name equals the query continent name, then the program looks through the natural hazhard description to look for the name of the natural hazhard (eg. earthquake). Question 2: I used a regex to first find the continent that matches the query continent, and then from there I parse the lowest elevation to a String, and converted it to a Double. Then I keep track of the current lowest elevation, replacing it if there is a lower elevation. The final lowest elevation is then printed. Question 3: I used a regex that finds the geographic coordinates line that lists the latitude and longitude. Then I check if the latitude is North or South depending on the query, and then the longitude for East or West depending on the query. Then the countries that satisfy the query are returned. Question 4: I first used a regex to identify if the current country is in the continent of the query continent. If so, then I went on to parse the paragraph that lists the political parties. Because they are separated by a semicolon, I count the number of semicolons a country's page has, and if it exceeds the indicated number of political parties asked by the user, the country is printed. Question 5: I used a regex to parse out the description of the current country's flag. From there, I know that for each color, there is a description about what it stands for, so there are always spaces around the color. I look for the color that the user is querying with spaces around it, and return all the countries that contain " blue ", for example. Question 6: For this question, I removed all unnecessary spaces from the full text of the HTML itself in order to make parsing using regexes easier. Because there are different electricity categories (consumption, production, etc.) it was a bit difficult to find a regex that worked for consumption. Then I parse out the electricity and the extension (trillion, billion, million) in order to have correct calculations. Parsing out population was less of an ordeal. I used another regex for that one and saved it to a String and converted it to a Double. I scaled everything down by 1000 when I was doing calcuations because a trillion is not a number that Java can hold in a double or an int. So scaling it was an easy way around that problem. After all of that parsing and converting, electricity consumption is divided by population. An arraylist containing country names and containing the calculations in order are used for reference and the arraylist containing the consumption per capita is sorted. Then the index of the highest 5, or whatever the user wants, countries are returned by looking through the reference arraylists. Question 7: For this one, I looked for the word "landlocked" first. Then if that was true, I looked at the border countries. The countries are separated by commas, so I counted the commas in the parsed section of HTML which I again got using a regex. If the number of commas equaled 1, then that meant that the country was landlocked within another country. Question 8: For this question, I parsed out the name of the capital and its coordinates for each country. The latitude and longitude were converted 0 - 180 and 0 - 360 respectively. Then they were parsed into Doubles and passed in as parameters for a Capital object, which were added to a capitals arraylist of Capitals. Then I created a 2D array that is one greater dimensionally than the size of the capitals arraylist (I also removed a country because it does not have an official capital - Nauru). The array is .size() + 1 because the first row and columns hold the information for each capital. Then the capitals are added to the array's top row and leftmost column. After that, I implemented an algorithm that calculates the difference between latitude and longitude of the two capitals in each cell of the "grid". If it comes out less than 5 for both latitude and longitude (because it uses a capital is the center point), the cell of the grid is marked as "Yes". Else it is marked as "No." After this, the the number of Yeses per row is counted and stored in an arraylist. Then I extract the highest number of yeses from the arraylist, find the index of the highest number of yeses, add 1 since the 2D array had to account for capital names and was thus one index larger than the number of capitals. With the row index of the largest number of Yeses figured out, I implemented a for loop to return the capital information that had a "Yes" checked in its place on the grid. Note: This is not the most efficient algorithm as it only calculates maximum number of capitals using one capital as the center point. Question 9: This was one of my wildcard questions, which was find the countries that grant universal suffrage at a certain age that the user could input. I used a regex to parse out the suffrage information, and checked if it contained the query age. Question 10: Another wildcard; find the countries that have an unemployment rate below a certain percentage. I used a regex to parse out the unemployment information, parsed the number into a double, and then checked if it was less than the query percentage.