Redundant code removal, CSV ID column name dependency removal, stress test support, and alternative color palette #244
…and/or first column
…or plotting polygons
Removal of redundant code, mainly in
In the following commit, I restructured and renamed the class variable Some variables that were initially declared as In the |
Support for matching ID columns between the CSV and GeoJSON files, removing the dependency on the order and name of the CSV ID column
Previously, the ID column in the CSV file had to have the same name as the ID column in the GeoJSON file, even if the user passed the name of the CSV ID column in the arguments. In addition, if the user did not pass the column name, the ID column had to be the first column in the CSV. A CSV ID column that shared its name with the GeoJSON ID column, but was neither named in the arguments nor placed first, could not be matched. Now the program goes through the columns in both the CSV and GeoJSON files to find the matching ID columns, so the two columns no longer need the same name, and the CSV ID column no longer needs to come first when no column name is given. If the user does provide the name of the CSV ID column in the arguments, the program checks that column first and tries to match it against the GeoJSON columns.
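To make the matching idea concrete, here is a minimal sketch, not the code from this PR. It assumes both files have already been loaded into a map from column (or property) name to its values, and it treats two columns as matching when they hold the same set of values. `ColumnMap` and `find_matching_id_columns` are hypothetical names introduced only for this illustration.

```cpp
#include <map>
#include <optional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical representation: column (or property) name -> its values.
using ColumnMap = std::map<std::string, std::vector<std::string>>;

// Returns the (CSV column, GeoJSON property) pair whose value sets are equal,
// or std::nullopt if no pair matches. If preferred_csv_column is given, that
// column is tried before any other CSV column.
std::optional<std::pair<std::string, std::string>> find_matching_id_columns(
  const ColumnMap &csv_columns,
  const ColumnMap &geojson_properties,
  const std::optional<std::string> &preferred_csv_column = std::nullopt)
{
  const auto as_set = [](const std::vector<std::string> &values) {
    return std::set<std::string>(values.begin(), values.end());
  };

  // Look for a GeoJSON property with the same set of values as csv_values.
  const auto match_in_geojson =
    [&](const std::vector<std::string> &csv_values)
    -> std::optional<std::string> {
    const auto csv_set = as_set(csv_values);
    if (csv_set.empty()) return std::nullopt;  // empty columns never match
    for (const auto &[name, values] : geojson_properties) {
      if (as_set(values) == csv_set) return name;
    }
    return std::nullopt;
  };

  // Step 1: try the column named by the user, if any.
  if (preferred_csv_column) {
    const auto it = csv_columns.find(*preferred_csv_column);
    if (it != csv_columns.end()) {
      if (const auto match = match_in_geojson(it->second)) {
        return std::make_pair(it->first, *match);
      }
    }
  }

  // Step 2: fall back to scanning every CSV column.
  for (const auto &[name, values] : csv_columns) {
    if (const auto match = match_in_geojson(values)) {
      return std::make_pair(name, *match);
    }
  }
  return std::nullopt;
}
```

Comparing value sets rather than names is what removes the dependency on column order and naming. A real implementation would also need to decide how to handle duplicate or partially overlapping IDs, which this sketch ignores.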
(minor) Formatting of code to Clang format file
A quick, minor change to ensure that all files in the project adhere to the standardized code style defined in the Clang format file.
Support for stress testing for new command line arguments and missing Included more tests for the Additional tests have been added as well, such as for CSV files that have no Note that |
Added alternative color palette if some but not all colors have been provided when plotting polygons
As requested in #242, an alternative color palette is used when some but not all rows in a CSV file have a color value when plotting polygons. Based on my interpretation of the request, when only some CSV rows have color values, the program keeps the existing color values and applies the Dark color palette only to the rows without one. If this interpretation is incorrect and the program should instead replace all color values in the CSV with the palette, please let me know.
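The interpretation above can be sketched as follows. This is only an illustration under assumptions: each row's color is modeled as a `std::optional<std::string>`, the palette is passed in by the caller, and `fill_missing_colors` is a hypothetical name that does not appear in the PR.

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Fills in only the missing colors, cycling through a fallback palette.
// Rows that already have a color are left untouched.
std::vector<std::string> fill_missing_colors(
  const std::vector<std::optional<std::string>> &row_colors,
  const std::vector<std::string> &fallback_palette)
{
  if (fallback_palette.empty()) {
    throw std::invalid_argument("fallback palette must not be empty");
  }
  std::vector<std::string> result;
  result.reserve(row_colors.size());
  std::size_t next_palette_idx = 0;
  for (const auto &color : row_colors) {
    if (color) {
      result.push_back(*color);  // keep the user-provided color
    } else {
      result.push_back(
        fallback_palette[next_palette_idx % fallback_palette.size()]);
      ++next_palette_idx;
    }
  }
  return result;
}
```

If the other interpretation turns out to be the intended one (replace every color once any is missing), the `if (color)` branch would simply be dropped.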
Pull Request Overview
This pull request refactors various parts of the code by removing redundant code and reformatting functions to eliminate dependencies on CSV ID column names while introducing support for new stress test command line arguments and an alternative color palette. Key changes include:
- Simplifying function declarations in header files with formatting updates.
- Removing the redundant set_id_header function and updating update_id_header_info to accept an optional parameter and return an int.
- Reordering and cleaning up include directives in the inset_state header.
Reviewed Changes
Copilot reviewed 24 out of 30 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| include/write_image.hpp | Reformatted function declaration for clarity. |
| include/parse_arguments.hpp | Simplified function declaration formatting. |
| include/inset_state.hpp | Adjusted include order and reformatted transform_polygons declaration. |
| include/cartogram_info.hpp | Revised property type and updated ID header-related functions. |
Files not reviewed (6)
- sample_data/algeria_by_wilaya_since_2019/algeria_population_by_wilaya_2022_no_colors.csv: Language not supported
- sample_data/austria_by_state_since_1945/austria_population_by_state_2020_output_equal_area_map.csv: Language not supported
- sample_data/belgium_by_region_since_1995/belgium_population_by_region_2022_plot_polygons.csv: Language not supported
- sample_data/brazil_by_state_since_1988/brazil_population_by_state_2021_no_inset.csv: Language not supported
- sample_data/croatia_by_county_since_1992/croatia_population_by_county_2021_export_preprocessed.csv: Language not supported
- sample_data/germany_by_state_since_1990/germany_population_by_state_2011_export_time_report.csv: Language not supported
  void project_to_equal_area();

- void update_id_header_info(const std::string &);
+ int update_id_header_info(const std::optional<std::string> &);
The update_id_header_info function now returns an int instead of void. Ensure that all callers are updated to handle the new integer return type or document the expected behavior clearly.
@nihalzp
Split up function into two according to purpose and removed redundant class variable The function Another minor change in this commit is the removal of the |
Restructured stress testing for flags for all sample data instead of specific data Changes were made to restructure the stress testing for different flags entirely. Instead of the script testing the different flags only if the CSV file has the aforementioned flag in its filename ("belgium_population_by_region_2022_plot_polygons.csv" will trigger the test for When stress testing the different flags, the script will first test all of the sample data files with no flags as per normal before proceeding to test all the sample data with one of the flags enabled such as |
Implement draft legend feature for inset polygon map |
nihalzp left a comment
Overall, it looks good.
Additionally, add const to variables that do not change over their lifetime to improve readability. It may also help the compiler optimise in some cases.
Suggested change:
- auto first_element = unique_properties_map_.begin();
- id_header_ = first_element->first;
+ id_header_ = unique_properties_map_.begin()->first;
set_id_header(unique_properties_[0]);
unique_properties_map_ = extract_unique_properties_map(j);
Add an assertion here to ensure that unique_properties_map_ is not empty. While this is currently guaranteed by extract_unique_properties_map, future changes to its implementation may break this assumption.
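A minimal sketch of such an assertion is shown below. The map type and the helper name `first_id_header` are assumptions made for illustration; in the PR the check would sit next to the existing assignment to id_header_.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the type of unique_properties_map_.
using PropertiesMap = std::map<std::string, std::vector<std::string>>;

// Returns the first key of the map, which is then used as the ID header.
// The assertion documents the precondition that the map is not empty and
// enforces it in debug builds.
std::string first_id_header(const PropertiesMap &unique_properties_map)
{
  assert(
    !unique_properties_map.empty() &&
    "extract_unique_properties_map returned no properties");
  return unique_properties_map.begin()->first;
}
```

Note that `assert` is compiled out when `NDEBUG` is defined, so release builds would need an explicit check if the empty case must be handled at runtime.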
+ per_grid_cell_ = 1;

  double total_area = total_inset_area();
  // double total_area = equal_area_projection_area_to_earth_area(total_inset_area());
  size_t multiplier_idx = 0;
- while (per_grid_cell < 0.015 * total_area) {
-   per_grid_cell *= multipliers[multiplier_idx];
+ while (per_grid_cell_ < 0.015 * total_area) {
+   per_grid_cell_ *= multipliers[multiplier_idx];
    multiplier_idx = (multiplier_idx + 1) % multipliers.size();
  }
  std::cerr << "Total area: " << total_area << std::endl;
- std::cerr << "Per grid cell: " << per_grid_cell << std::endl;
- // per_grid_cell = earth_area_to_equal_area_projection_area(per_grid_cell);
- const double grid_line_spacing = sqrt(per_grid_cell);
+ std::cerr << "Per grid cell: " << per_grid_cell_ << std::endl;
+ // per_grid_cell_ = earth_area_to_equal_area_projection_area(per_grid_cell_);
+ const double grid_line_spacing = sqrt(per_grid_cell_);
I think per_grid_cell_ should not be a data member of InsetState. A better design could be a function that returns this value, taking parameters such as the InsetState object, equal_area (indicating whether the area is for the cartogram or the equal-area map), and multipliers.
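One possible shape for such a function is sketched below. To keep the example self-contained it takes the total area directly instead of an InsetState object and omits the equal_area parameter, so the signature is an assumption rather than the final design. The loop and the 0.015 threshold are taken from the diff above; `compute_per_grid_cell` is a hypothetical name.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Starts at 1 and repeatedly multiplies by the next entry of `multipliers`
// (cycling through them) until the value is at least 1.5% of `total_area`.
double compute_per_grid_cell(
  const double total_area,
  const std::vector<double> &multipliers)
{
  if (multipliers.empty()) {
    throw std::invalid_argument("multipliers must not be empty");
  }
  // Every multiplier must exceed 1, otherwise the loop below may not end.
  for (const double multiplier : multipliers) {
    if (multiplier <= 1.0) {
      throw std::invalid_argument("each multiplier must be greater than 1");
    }
  }

  double per_grid_cell = 1.0;
  std::size_t multiplier_idx = 0;
  while (per_grid_cell < 0.015 * total_area) {
    per_grid_cell *= multipliers[multiplier_idx];
    multiplier_idx = (multiplier_idx + 1) % multipliers.size();
  }
  return per_grid_cell;
}
```

With the value returned instead of stored, callers that need it for the cartogram and for the equal-area map can each compute it from their own total area, and InsetState no longer carries state that is only valid after a particular call.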
// Display value per grid cell in billions/millions/thousands
std::string grid_cell_label = "";
if (grid_cell_value >= 1000000000) {
  int billions = grid_cell_value / 1000000000;
  grid_cell_label = std::to_string(billions) + "B";
} else if (grid_cell_value >= 1000000) {
  int millions = grid_cell_value / 1000000;
  grid_cell_label = std::to_string(millions) + "M";
} else if (grid_cell_value >= 1000) {
  int thousands = grid_cell_value / 1000;
  grid_cell_label = std::to_string(thousands) + "K";
} else {
  grid_cell_label = std::to_string(grid_cell_value);
}

// Display total value in billions/millions/thousands with 1 decimal place
std::string total_label = "Total: ";
std::stringstream sstream;
if (total_value >= 1000000000) {
  double billions = (double)total_value / 1000000000;
  sstream << std::fixed << std::setprecision(1) << billions;
  total_label += sstream.str() + "B";
} else if (total_value >= 1000000) {
  double millions = (double)total_value / 1000000;
  sstream << std::fixed << std::setprecision(1) << millions;
  total_label += sstream.str() + "M";
} else if (total_value >= 1000) {
  double thousands = (double)total_value / 1000;
  sstream << std::fixed << std::setprecision(1) << thousands;
  total_label += sstream.str() + "K";
} else {
  total_label += std::to_string(total_value);
}
It would be better if they were part of separate functions.
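As a rough illustration of that split, the two formatting branches could become two small functions. The names `abbreviate_whole` and `abbreviate_one_decimal` are hypothetical, and the `long long` parameter type is an assumption, since the types of grid_cell_value and total_value are not visible in the diff.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Abbreviates a value with integer truncation, e.g. 2500000 -> "2M".
std::string abbreviate_whole(const long long value)
{
  if (value >= 1000000000LL) {
    return std::to_string(value / 1000000000LL) + "B";
  }
  if (value >= 1000000LL) {
    return std::to_string(value / 1000000LL) + "M";
  }
  if (value >= 1000LL) {
    return std::to_string(value / 1000LL) + "K";
  }
  return std::to_string(value);
}

// Abbreviates a value with one decimal place, e.g. 2500000 -> "2.5M".
std::string abbreviate_one_decimal(const long long value)
{
  const auto scaled = [](const double scaled_value, const char *suffix) {
    std::ostringstream stream;
    stream << std::fixed << std::setprecision(1) << scaled_value;
    return stream.str() + suffix;
  };
  if (value >= 1000000000LL) return scaled(value / 1e9, "B");
  if (value >= 1000000LL) return scaled(value / 1e6, "M");
  if (value >= 1000LL) return scaled(value / 1e3, "K");
  return std::to_string(value);
}
```

The call sites would then reduce to something like `grid_cell_label = abbreviate_whole(grid_cell_value);` and `total_label = "Total: " + abbreviate_one_decimal(total_value);`.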
  printf "All *.geojson, *.csv, *.svg, *.png and *.ps files deleted.\n" | color $red
  exit ${failed}
fi
At the end, it would be useful to display summary statistics grouped by flags. For example: 5 failures without any flag, 3 failures with the plot_polygons flag, etc. Additionally, for each failed test, we should display the exact command that can be run from the root folder to reproduce it.
@nihalzp
@adisidev
A lot of new changes have been pushed so I decided to make a new pull request for them. The changes that have been made are as follows:
read_geojson.cpp
--output_equal_area_map, --plot_polygons, --export_preprocessed, and --export_time_report), note that --add-grid command testing has not been added as its related issue Add Grid to SVG Outputs #239 has not been addressed yet
Color and Inset columns