FAQ
Q1. I tried installing/upgrading the Skill Metrics package using the following pip install commands but it did not work for me:
pip install SkillMetrics
or
pip install SkillMetrics --upgrade
Is there another way to install the package?
A1. You may want to first try installing the package as a superuser because of how permissions are set on your computer.
sudo pip install SkillMetrics
or
sudo pip install SkillMetrics --upgrade
Second, you might also try installing the package by downloading it directly from the GitHub web page and running the setup.py script in the top level folder:
python setup.py install
Note that depending upon how your pip is defined, you may be installing the package under Python 2.7 rather than your latest Python 3. To find out which one, run pip with a request for the version:
% which pip
/opt/local/bin/pip
% pip --version
pip 19.1 from /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pip (python 2.7)
To install under Python 3 you may have to use the pip3 command instead.
sudo pip3 install SkillMetrics
or
sudo pip3 install SkillMetrics --upgrade
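One way to sidestep the question of which Python a bare pip command belongs to is to run pip as a module of the interpreter you intend to use. The sketch below only builds and prints that command. The helper name pip_install_command is illustrative and is not part of SkillMetrics.

```python
import sys


def pip_install_command(package, upgrade=False):
    """Build a pip command bound to the interpreter running this script."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        cmd.append("--upgrade")
    cmd.append(package)
    return cmd


if __name__ == "__main__":
    # Show which interpreter is active and the matching install command.
    print("Interpreter:", sys.executable)
    print("Command:", " ".join(pip_install_command("SkillMetrics", upgrade=True)))
```

The printed command can be pasted into a terminal, so the package lands in the site-packages of the interpreter that printed it.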
Q2. A "pip install SkillMetrics --upgrade" doesn't install the latest update (1.2.1). It still installs version 1.1.8. Specifying the version to install produces the error:
"No matching distribution found for SkillMetrics==1.2.1"
A2. Conflicts may arise because of the installation of Python distributions such as Anaconda or web browser environments such as Google Colab. Try downloading the SkillMetrics code from the repository and importing it directly.
Open the SkillMetrics code repository and download the code as zip. Unzip and copy the skill_metrics folder to the location of your code. You can then
import skill_metrics as sm
Python looks in the package folder at your code location before it searches any other location on your computer, so it will import the package from the folder you downloaded. This works for most users.
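To confirm which copy of the package Python would actually pick up, you can ask the import system where it resolves the name. This is a minimal sketch using the standard library. The helper name module_origin is made up for this example.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # 'skill_metrics' is the package folder name used in the import statement above.
    print("skill_metrics resolves to:", module_origin("skill_metrics"))
```

If the printed path points into site-packages rather than your code folder, the downloaded copy is not the one being imported.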
Q1. Why does the Skill Metrics package not appear when I run Python from within Eclipse? I get the following error:
Traceback (most recent call last):
File "C:\Users\rochfordp\workspace\SkillMetrics\Examples\target1.py", line 40, in
import skill_metrics as sm
ImportError: No module named skill_metrics
A1. Check that the Eclipse configuration for Python Interpreters includes the directory within which packages are installed. For macOS, the Skill Metrics package is installed in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages. Alternatively, if you have downloaded the complete directory tree from GitHub, you will need to include the root folder in the PYTHONPATH, e.g. /${PROJECT_DIR_NAME}. Instructions are given below for Eclipse Neon.2 Release (4.6.2).
PIP Package Installation
To make the change for a pip package installation in MacOS, perform the following steps:
- Open the Eclipse properties: Eclipse -> Preferences...
- In the Preferences window, open the PyDev drop down list (>PyDev) in the left hand panel.
- Open the Interpreters drop down list under PyDev (>Interpreters).
- Select Python Interpreter.
- The lower panel will now display the System PYTHONPATH. Select the New Folder button, navigate to the folder containing the Skill Metrics package, and open it.
- The latest selected folder will now appear at the bottom of the panel.
- Click the OK button.
The SkillMetrics package should now be found by the calling Python script.
Source Folder Installation
To make this change for a source folder installation in MacOS or Windows, perform the following steps:
- Select the top level folder in the Navigator or PyDev Package Explorer panel.
- Right click to pull up the context menu and select Properties. This will open the Properties window.
- Select PyDev - PYTHONPATH and then the "Source Folders" tab menu.
- Click on the "Add source folder" button on the right side, navigate to the folder containing the Skill Metrics package (e.g. SkillMetrics), and open it.
- The latest selected folder will now appear at the bottom of the panel.
- Click the OK button.
The SkillMetrics package should now be found by the calling Python script.
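If changing the IDE configuration is not convenient, a script can also extend its own module search path at run time. This is a sketch of that alternative, not a PyDev feature. The helper name ensure_on_path and the example folder are placeholders.

```python
import os
import sys


def ensure_on_path(folder):
    """Put folder at the front of sys.path unless it is already listed."""
    folder = os.path.abspath(folder)
    if folder not in sys.path:
        sys.path.insert(0, folder)
    return folder


if __name__ == "__main__":
    # Placeholder: the folder that contains the skill_metrics package directory.
    ensure_on_path(os.path.join(os.path.expanduser("~"), "workspace", "SkillMetrics"))
    print(sys.path[0])
```

Call the helper before the import skill_metrics statement so the folder is searched first.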
Q1. Why does the Skill Metrics package report the following error when plotting symbol markers on the diagrams?
Traceback (most recent call last):
File "/Users/Peter/git/SkillMetrics/Examples/target7.py", line 112, in <module>
markerSize = 10, alpha = 0.0)
File "/Users/Peter/git/SkillMetrics/skill_metrics/target_diagram.py", line 75, in target_diagram
plot_pattern_diagram_markers(RMSDs,Bs,option)
File "/Users/Peter/git/SkillMetrics/skill_metrics/plot_pattern_diagram_markers.py", line 65, in plot_pattern_diagram_markers
rgba = clr.to_rgb(option['markercolor']) + (alpha,)
AttributeError: 'module' object has no attribute 'to_rgb'
A1. This problem occurs when using a version of matplotlib earlier than 2.0.0. Version 2.0.0 is distributed with Python 3.6, but standard distributions of Python 2.7 usually come with earlier versions of matplotlib (e.g. 1.3.1 and 1.5.0). To check which version of matplotlib you are using, execute the following statements at the Python command line (or within a script):
% /usr/bin/python
Python 2.7.10 (default, Jul 14 2015, 19:46:27)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.39)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import matplotlib
>>> matplotlib.__version__
'1.3.1'
>>> ^D
You can resolve this problem by upgrading to matplotlib 2.0.0 or later using the following pip command:
% pip install --upgrade matplotlib
or
% pip install -U matplotlib
A check on the version after installation completes should yield the following result:
% python
Python 2.7.10 (v2.7.10:15c95b7d81dc, May 23 2015, 09:33:12)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import matplotlib
>>> matplotlib.__version__
'2.0.0'
>>> ^D
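The same check can be scripted. The sketch below compares a version string against the 2.0.0 threshold mentioned above. The helper names version_tuple and needs_upgrade are illustrative, and the parsing is deliberately simple: it keeps only the leading integers of each dotted component.

```python
import re


def version_tuple(version):
    """Turn a version string such as '1.3.1' or '3.8.0rc1' into a tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def needs_upgrade(version, minimum=(2, 0, 0)):
    """True when version is older than the minimum (2.0.0 by default)."""
    found = version_tuple(version)
    # Pad short versions such as '2.0' so they compare correctly against (2, 0, 0).
    found += (0,) * (len(minimum) - len(found))
    return found < minimum


if __name__ == "__main__":
    try:
        import matplotlib
    except ImportError:
        print("matplotlib is not installed")
    else:
        print(matplotlib.__version__, "needs upgrade:", needs_upgrade(matplotlib.__version__))
```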
Q2. In the rare case where two model predictions compared against observations fall on the same point, is there a way for the two symbols to be seen on the target and Taylor diagrams?
A2. You can control the face color of the marker symbols using the alpha option to change the blending of symbol face color (0.0 transparent through 1.0 opaque). Try setting the face color to transparent (alpha = 0). This will allow symbols of different shape to be seen when lying over one another. Examples of how to do this can be found in Target Diagram Example 7 and Taylor Diagram Example 9.
Q3. Is there an easy way to check whether points within a target or Taylor diagram occupy the same location?
A3. Use the check_duplicate_stats function to find the pairs of points that reside within a certain percentage distance of each other (default is 1%), and then use the report_duplicate_stats function to report these pairs of points. Examples of their usage can be found in Target Diagram Example 7 and Taylor Diagram Example 9.
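To illustrate the idea behind such a check, here is a standalone sketch that flags pairs of points lying close together. It is not the package's implementation. The function name near_duplicate_pairs is an assumption made for this example, as is the choice to measure the separation relative to the larger of the two points' distances from the origin. The package's own criterion may differ.

```python
import math
from itertools import combinations


def near_duplicate_pairs(xs, ys, threshold=0.01):
    """Return index pairs (i, j) whose points lie within threshold (a fraction) of each other."""
    pairs = []
    for i, j in combinations(range(len(xs)), 2):
        separation = math.hypot(xs[i] - xs[j], ys[i] - ys[j])
        scale = max(math.hypot(xs[i], ys[i]), math.hypot(xs[j], ys[j]))
        # Identical points are always reported; otherwise compare the relative separation.
        if separation == 0.0 or separation <= threshold * scale:
            pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    # Made-up statistics: the first and third points nearly coincide.
    bias = [0.10, 0.50, 0.1005]
    crmsd = [0.80, 0.20, 0.8000]
    print(near_duplicate_pairs(bias, crmsd))
```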
Q4. Is there a way to easily change the size of the fonts on the target and Taylor diagrams?
A4. This is possible using the matplotlib customization function matplotlib.rc within your main program to dynamically change the font size of all the text. For example, to set all the text to use a font size of 11 pt include the following pair of statements:
import matplotlib as mpl
mpl.rc('font', size=11)
Note that applying other features such as
mpl.rc('legend', fontsize='large')
will not work because this property is explicitly overridden in the function plot_pattern_diagram_markers. Of course, you can always choose to edit the functions in the SkillMetrics package to tailor the diagram properties to your preferences.
Q5. Is there a way to retrieve the earlier matplotlib 1.x font style for the text?
A5. This is possible using the matplotlib 2.0 style use function within your main program to change the font style to classic.
import matplotlib as mpl
mpl.style.use('classic')
Q6. When I try to run the taylor2.py example program from the command line in the Scientific PYthon Development EnviRonment (Spyder), it doesn't use the same graphics as what appears in the plot examples. How can I correct this problem?
A6. Implementing the following lines at the beginning of the example program apparently corrects the problem:
import matplotlib
matplotlib.use('TkAgg')
Q7. When I produce a diagram containing small values for the ticks on the axes, I get a messy-looking diagram with tick values containing many significant figures, e.g. 0.0300000000000013. How can I limit the values to a much smaller number of significant figures, e.g. 0.03?
A7. This problem typically arises when using the numpy arange function to produce a range of values for the axes tick values. The easy way to correct the problem is to round off the resulting values with the numpy around function and then provide these values to the target_diagram or taylor_diagram function, e.g.
ticks = np.around(np.arange(-0.03,0.031,0.01),2)
sm.target_diagram(bias,crmsd,rmsd, markerLabel = label, ticks = ticks)
This problem should also now be resolved automatically with the latest version of the package.
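The effect can be reproduced without NumPy. The sketch below is a pure-Python stand-in for the np.around(np.arange(...)) call: it builds the same kind of evenly spaced sequence and rounds each value. The helper name rounded_ticks is illustrative only.

```python
import math


def rounded_ticks(start, stop, step, decimals=2):
    """Evenly spaced values in [start, stop), each rounded to the given number of decimals."""
    count = max(0, math.ceil((stop - start) / step))
    return [round(start + i * step, decimals) for i in range(count)]


if __name__ == "__main__":
    raw = [-0.03 + i * 0.01 for i in range(7)]
    print(raw)                                  # unrounded values may show long trailing digits
    print(rounded_ticks(-0.03, 0.031, 0.01))    # each value rounded to two decimals
```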
Q8. The graphs generated through the SkillMetrics Python package show individual model RMSD values on my Taylor diagram that are different from the RMSD values calculated from my raw CSV data. Why?
A8. The RMSD contours displayed on the Taylor diagram are centered RMSD. The convention for Taylor diagrams is to use the "RMSD" label rather than "CRMSD". As a consequence, many users mistakenly calculate these centered RMSD using the "rmsd" function rather than the "centered_rms_dev" in the SkillMetrics package:
rmsd = sm.rmsd(pred,obs) # root-mean-square deviation
crmsd = sm.centered_rms_dev(pred,obs) # centered-root-mean-square deviation
Make certain you are calculating centered RMSD values from your raw data when wishing to compare against the values plotted in the Taylor diagram. Note that the taylor_statistics function returns CRMSD values and not RMSD.
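The difference between the two statistics is easy to see on a small made-up data set. The functions below are plain-Python sketches written for this illustration, not the package's own rmsd and centered_rms_dev implementations. They use the usual definitions, under which the two quantities are linked through the bias by RMSD^2 = bias^2 + CRMSD^2.

```python
import math


def mean(values):
    return sum(values) / len(values)


def rmsd(pred, obs):
    """Root-mean-square deviation between predictions and observations."""
    return math.sqrt(mean([(p - o) ** 2 for p, o in zip(pred, obs)]))


def centered_rmsd(pred, obs):
    """RMSD after removing each series' own mean (the quantity shown on a Taylor diagram)."""
    pm, om = mean(pred), mean(obs)
    return math.sqrt(mean([((p - pm) - (o - om)) ** 2 for p, o in zip(pred, obs)]))


if __name__ == "__main__":
    obs = [1.0, 2.0, 3.0, 4.0]
    pred = [2.5, 3.0, 4.5, 6.0]      # made-up values with a positive bias
    bias = mean(pred) - mean(obs)
    print("RMSD :", rmsd(pred, obs))
    print("CRMSD:", centered_rmsd(pred, obs))
    # Check the identity RMSD^2 = bias^2 + CRMSD^2 on this data set.
    print(math.isclose(rmsd(pred, obs) ** 2, bias ** 2 + centered_rmsd(pred, obs) ** 2))
```

Whenever the predictions are biased, the RMSD is larger than the centered RMSD, which is why values computed with the wrong function do not match the diagram.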
Q9. The centered RMSD values calculated from my raw CSV data do not appear on the Taylor diagrams where they should relative to the contours. Why?
A9. Check whether you are calculating the statistics for each model using a different set of observations for each model, that is, whether the data for each model are paired with their own unique set of observations. This is a mistake commonly made by users inexperienced in the use of these diagrams. A Taylor diagram is only meaningful if you calculate the statistics for all your model predictions with respect to the same set of observations.
The key point to understand is that in the Taylor diagram the Centered RMSD (CRMSD) contours have an origin that is specified by the value of the standard deviation (SDEV) of the observations. If different observations are used for the statistics calculated for the differences between each model and observation, then the standard deviation of the observations changes for each model point. The RMSD contours change position accordingly as do the locations of the model points in the Taylor diagram. Comparing the location of the model points relative to the observation RMSD point on the x-axis then becomes meaningless as it is not a constant reference. The underlying Taylor criterion is not satisfied and the value of the diagram is lost.
To illustrate, I’ve created three Taylor diagrams below using model predictions that each have separate observations, with the statistics calculated from these data. Each Taylor diagram uses the standard deviation of the observations paired with a different model: M3, M5, and M9. The statistics used to create each Taylor diagram are shown above the diagram, with the "N" column showing the number of observations. The "Obs" line shows the observation standard deviation used in the Taylor diagram (sdevr) and serves to identify the model predictions to which it is paired. As these figures clearly show, the RMSD contours in the Taylor diagrams change, the position of the points relative to the contours is different in each case, and it is not straightforward to draw a conclusion as to the best model result.
  Model     N     crmsd     ccoef     sdevf     sdevr
0   Obs  1868  0.000000  1.000000  1.440221  1.399578
1    M1  2588  0.777253  0.769187  1.130161  1.156618
2    M2  1183  0.829660  0.535244  0.883222  0.835697
3    M3  1868  0.980484  0.761945  1.440221  1.399578
4    M4   301  1.354128  0.798773  1.706210  2.246014
5    M5   299  1.602221  0.881960  0.479432  2.009045
6    M6   286  1.586683  0.866994  0.508558  2.007231
7    M7   184  0.687430  0.754403  0.619057  1.021486
8    M8   230  0.406422  0.937196  1.164650  1.079176
9    M9   828  0.673950  0.752989  1.015737  0.851245

  Model     N     crmsd     ccoef     sdevf     sdevr
0   Obs   299  0.000000  1.000000  0.479432  2.009045
1    M1  2588  0.777253  0.769187  1.130161  1.156618
2    M2  1183  0.829660  0.535244  0.883222  0.835697
3    M3  1868  0.980484  0.761945  1.440221  1.399578
4    M4   301  1.354128  0.798773  1.706210  2.246014
5    M5   299  1.602221  0.881960  0.479432  2.009045
6    M6   286  1.586683  0.866994  0.508558  2.007231
7    M7   184  0.687430  0.754403  0.619057  1.021486
8    M8   230  0.406422  0.937196  1.164650  1.079176
9    M9   828  0.673950  0.752989  1.015737  0.851245

  Model     N     crmsd     ccoef     sdevf     sdevr
0   Obs   828  0.000000  1.000000  1.015737  0.851245
1    M1  2588  0.777253  0.769187  1.130161  1.156618
2    M2  1183  0.829660  0.535244  0.883222  0.835697
3    M3  1868  0.980484  0.761945  1.440221  1.399578
4    M4   301  1.354128  0.798773  1.706210  2.246014
5    M5   299  1.602221  0.881960  0.479432  2.009045
6    M6   286  1.586683  0.866994  0.508558  2.007231
7    M7   184  0.687430  0.754403  0.619057  1.021486
8    M8   230  0.406422  0.937196  1.164650  1.079176
9    M9   828  0.673950  0.752989  1.015737  0.851245
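The dependence on the observation standard deviation can be checked numerically with the Law of Cosines relation that underlies the Taylor diagram, CRMSD^2 = sdevf^2 + sdevr^2 - 2 sdevf sdevr ccoef. The sketch below applies it to the M1 statistics listed above. The function name crmsd_from_taylor is made up for this example.

```python
import math


def crmsd_from_taylor(sdev_model, sdev_obs, ccoef):
    """Centered RMSD implied by two standard deviations and their correlation (Law of Cosines)."""
    return math.sqrt(sdev_model ** 2 + sdev_obs ** 2 - 2.0 * sdev_model * sdev_obs * ccoef)


if __name__ == "__main__":
    # M1 statistics from the tables above (sdevf, sdevr, ccoef); the listed crmsd is 0.777253.
    print(crmsd_from_taylor(1.130161, 1.156618, 0.769187))
    # Same model statistics measured against the observation SDEV that is paired with M5.
    print(crmsd_from_taylor(1.130161, 2.009045, 0.769187))
```

The first value reproduces the tabulated crmsd for M1 to within rounding. The second shows how far the same model point sits from the reference point once a different observation standard deviation is plotted, which is why the contours no longer correspond to the model's own CRMSD.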
Q10. How can I add more radial RMSD contours?
A10. You can specify the RMSD contours using the tickRMS option as well as control the precision of the numeric labels using the rmsLabelFormat option:
sm.taylor_diagram(sdev,crmsd,ccoef, tickRMS = np.arange(0,60,20),
rmsLabelFormat = ':.1f')
Q11. Is it possible to limit the correlation range within a Taylor diagram when there is a cloud of data with similar correlation values, so as to improve the clarity of the plot?
A11. It is not possible to limit the correlation range because the Taylor diagram is based on the similarity of the equation relating the various statistics with the Law of Cosines (see Taylor Diagram Primer document). Furthermore, if your data points are that close together one has to question whether the differences are significant.
Q12. When we look at the Taylor diagram, we clearly see the correlation value of 0.5 does not fall on the 45-degree mark. Why does this happen? How do you calculate the angular ticks and labels? Furthermore, if we do not need isolines representing RMSE, do we need to calculate RMSE at all, or do we use STD and correlation to plot points?
A12. Refer to the equation at the bottom of page 2 of the Taylor Diagram Primer document for how the angle is calculated in a Taylor diagram.
Strictly, only the STD and correlation values are needed to plot points in the Taylor diagram. The taylor_diagram method still requires you provide the RMSE, as this information is needed when using the color bar options to color code the markers positioned at the points. Note that the RMSE will be calculated as a matter of course when using the taylor_statistics function.
The SkillMetrics package provides the option to suppress the isolines representing RMSE as shown in Taylor Diagram Example 13.
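As a quick numerical illustration of the angular scale, the sketch below assumes the standard Taylor diagram relation in which the correlation is the cosine of the angle measured from the x-axis. The Primer remains the authoritative reference, and the function names here are illustrative.

```python
import math


def taylor_angle_degrees(ccoef):
    """Angle from the x-axis at which a correlation value sits on a Taylor diagram."""
    return math.degrees(math.acos(ccoef))


def taylor_point(sdev, ccoef):
    """Cartesian position of a point given only its standard deviation and correlation."""
    theta = math.acos(ccoef)
    return sdev * math.cos(theta), sdev * math.sin(theta)


if __name__ == "__main__":
    print(taylor_angle_degrees(0.5))              # close to 60 degrees, not 45
    print(taylor_angle_degrees(math.sqrt(0.5)))   # close to 45 degrees
    print(taylor_point(1.130161, 0.769187))       # position from SDEV and correlation alone
```

Because the scale follows the arccosine, a correlation of 0.5 falls at about 60 degrees, and the 45-degree mark corresponds to a correlation of about 0.71.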
Q13. How can I adjust the legend size and location?
A13. The legend size and many other properties can be adjusted using the matplotlib customization function matplotlib.rc within your main program. For example, if your original figure is the one produced by the Taylor 10 example, then the following statements
import matplotlib as mpl
# Change legend whitespace to the border
mpl.rc('legend', borderpad=1.0) # default is 0.4
# Change vertical space between the legend entries
mpl.rc('legend', labelspacing=1.0) #default is 0.5
# Change the spacing between columns in the legend
mpl.rc('legend', columnspacing=0.75) #default is 2.0
# Draw a shadow behind legend
mpl.rc('legend', shadow=True) #default is False
# Make legend box blue
mpl.rc('legend', edgecolor='b') #default is background patch boundary color
will change the appearance of the legend accordingly.
Information on the legend parameters that can be customized is provided in the matplotlib documentation for legend and legend_handles as well as the tutorial for style sheets (see LEGEND part of default matplotlibrc file).
Unfortunately, trying to control the legend font size and legend location
mpl.rc('legend', fontsize='large')
mpl.rc('legend', loc='center right')
will not work because the former is explicitly overridden and the latter is controlled by the bbox_to_anchor keyword in the function plot_pattern_diagram_markers. Of course, you can always choose to edit the functions in the SkillMetrics package to tailor the diagram properties to your preferences.
Q1. When I try to run the taylor2.py example program I encounter problems with loading the taylor_data.pkl data file. How can I fix this problem?
A1. Pickling is a way to convert a Python object (list, dict, etc.) into a byte stream. The idea is that this byte stream contains all the information necessary to reconstruct the object in another Python script. A problem with loading the pickle file may occur if you downloaded the .pkl file directly from GitHub. The file may have been downloaded as an HTML page rather than as the pkl file itself. Check that the downloaded file has a binary format and, if not, download the file again, making sure it preserves its native file format.
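A quick way to tell whether a download went wrong is to look at the first bytes of the file. The sketch below flags files that begin like an HTML page. The helper name looks_like_html and the 512-byte probe size are choices made for this example.

```python
import os


def looks_like_html(path, probe=512):
    """Return True when the start of the file resembles an HTML page rather than binary data."""
    with open(path, "rb") as f:
        head = f.read(probe).lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")


if __name__ == "__main__":
    name = "taylor_data.pkl"   # file name taken from the question above
    if os.path.exists(name):
        print(name, "looks like HTML:", looks_like_html(name))
```

If the check reports True, download the file again using the raw file link instead of saving the web page.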
Q2. How can I see the contents of the pickle files used in the target and Taylor example scripts?
A2. You can find out the contents of what has been loaded from a pkl file using the type() and vars() functions on the object loaded from the file. Some example code for the pkl file used for the example target diagram scripts is included below.
# Simple Python script
import pandas
import pickle
from pprint import pprint

# Class of the object stored in the pickle file
class Container(object):
    def __init__(self, pred1, pred2, pred3, ref):
        self.pred1 = pred1
        self.pred2 = pred2
        self.pred3 = pred3
        self.ref = ref

if __name__ == '__main__':
    # Read data from pickle file
    with open('target_data.pkl3', 'rb') as f:
        data = pickle.load(f)

    # Report data type of object loaded from pickle file
    print('type(data): ', type(data))

    # Print the variables of the object loaded from pickle file
    pprint(vars(data))
Q3. How can I convert a pkl file to csv using python?
A3. There are many solutions for this on the Internet and your particular implementation will depend upon the contents of the pickle file. Enclosed below is a simple solution using pandas for the pkl file used for the example target diagram scripts.
import pandas
import pickle

# Class of the object stored in the pickle file
class Container(object):
    def __init__(self, pred1, pred2, pred3, ref):
        self.pred1 = pred1
        self.pred2 = pred2
        self.pred3 = pred3
        self.ref = ref

if __name__ == '__main__':
    # Read data from pickle file
    with open('target_data.pkl3', 'rb') as f:
        data = pickle.load(f)

    # Write each data structure to its own CSV file
    # (pred1.csv, pred2.csv, pred3.csv, and ref.csv)
    for name in ('pred1', 'pred2', 'pred3', 'ref'):
        df = pandas.DataFrame(getattr(data, name))
        df.to_csv(name + '.csv')
Q4. How can I read in data from a Comma-Separated Values (CSV) file?
A4. Refer to Reading Data.
Q1. How can I have my code changes added to the SkillMetrics package?
A1. New options and capabilities to the package are always welcome. Submit a pull request from your forked branch after having exercised due diligence by performing Regression Testing. I will try to consider the pull request in due course. Please be patient, as I support this in my personal time as a hobby and available time can be at a premium with my employment obligations.