# Introduction
The data set contains the responses to questionnaires that companies and city authorities sent to the CDP organization. The questionnaires cover climate change and water consumption management (the latter concerns companies only).
The task is to link these data with the social or environmental performance of territories (states, regions, municipalities). Such a link can be hard to establish for several reasons, including:
* the data analyzed may be biased and may not represent the overall situation (responses may come from companies that are more committed, perform better, are larger or face more pressure for transparency)
* measuring the impacts of companies’ (or city authorities’) activities is much more difficult than measuring the results of those activities
* linking the results of companies or city authorities to territorial performance is difficult without a model that connects causes with effects.
Nonetheless, we can connect the questionnaire data with three social, economic and environmental indicators (life expectancy at birth, percentage of premature births and a measure of income inequality) and draw some interesting conclusions.

# Companies data – Climate change

The climate change data include the questionnaire responses of 1278 companies: 1134 from the USA and 144 from Canada.
![Image01.png](attachment:Image01.png)

720 companies submitted the questionnaire in 2 or 3 of the years covered, probably because these companies are more committed to climate change topics.
![Image02.png](attachment:Image02.png)

For a richer analysis, we suggest working at the level of individual questionnaires rather than companies, because this gives more data to analyze: in total we can use 2595 questionnaires covering 2 countries, 3 years and 13 industrial sectors.
![Image03.png](attachment:Image03.png)

It is also useful to analyze questionnaires by the reason companies respond to the CDP and by the type of questionnaire requested. A questionnaire can be requested by investors or by customers (the latter are called “supply chain questionnaires” and are simplified). The questionnaires requested by investors alone, or by both investors and customers, account for 62% of the total (1613 questionnaires).
Moreover, for some industrial sectors CDP provides sector-specific questions to better analyze company performance: 854 questionnaires use such a sectoral version (across 15 different sectors), 1718 use the standard questionnaire and for 41 the type of questionnaire is not indicated.
![Image04.png](attachment:Image04.png)

Looking at the date on which companies started compiling the questionnaire, we can see that the answers show a peak that probably corresponds to the questionnaire deadline. We can hypothesize that companies answering before the deadline are more committed to issues related to climate change.
![Image05.png](attachment:Image05.png)
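
As a minimal sketch of how such a peak could be located, assuming each questionnaire carries a start date, we can count questionnaires per day and treat the busiest day as the presumed deadline. The dates and the 14-day "early" threshold below are hypothetical illustration values, not CDP data or CDP rules.

```python
from collections import Counter
from datetime import date

# Hypothetical start dates of questionnaire compilation (illustrative only).
start_dates = [
    date(2020, 7, 1), date(2020, 8, 10), date(2020, 8, 26),
    date(2020, 8, 26), date(2020, 8, 26), date(2020, 8, 25),
]

# Count questionnaires per day; the busiest day is taken as the presumed deadline.
per_day = Counter(start_dates)
peak_day, peak_count = per_day.most_common(1)[0]

# Flag questionnaires started at least `early_days` before the peak.
# The threshold is an assumption made for this sketch.
early_days = 14
early = [d for d in start_dates if (peak_day - d).days >= early_days]

print(peak_day, peak_count, len(early))
```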

The questionnaire is divided into 8 sections covering general information about the company, a description of the management system for climate-related information, risk and opportunity management, quantitative emissions data and other information. The 2020 questionnaire had 55 questions and 220 individual data points.
![Image06.png](attachment:Image06.png)

In total, the questionnaires contain around 2 million data points, and each completed questionnaire contains 792 on average. This average varies with the reason for completing the questionnaire (those requested by investors contain more information), with the sector-specific version (the questionnaire for financial companies has the fewest data points on average) and with the year (the requests may change each year).
![Image07.png](attachment:Image07.png)
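
The comparison of averages by group can be sketched as follows. The records and the helper name `average_by_group` are hypothetical; the same grouping could be applied to the sector or the year instead of the requester.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (requester, number of data points); values are illustrative.
questionnaires = [
    ("investors", 900), ("investors", 1100),
    ("customers", 300), ("customers", 500),
]

def average_by_group(rows):
    """Return the mean number of data points for each group label."""
    groups = defaultdict(list)
    for label, n_points in rows:
        groups[label].append(n_points)
    return {label: mean(values) for label, values in groups.items()}

averages = average_by_group(questionnaires)
print(averages)
```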

The average number of data points is lower for the questionnaires requested by customers, while in some sectors the average is raised by the "breakdown" section (breakdown of emissions) and the "energy" section (detailed data relating only to companies in the energy sector). Excluding these aspects, the data are more homogeneous.
![Image08.png](attachment:Image08.png)

About half of the answers are blank, and the breakdown of answers by questionnaire section changes when blank answers are excluded. On average, each questionnaire (excluding those requested by customers) contains 367 responses. The average number of responses can be used to assess the completeness of the questionnaires. A detailed compilation of the targets section, for example, can indicate the maturity of a company. The presence of answers in the emissions section is also important, because this section contains one of the most important pieces of information: the quantification of the company's greenhouse gas emissions.
![Image09.png](attachment:Image09.png)
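
A minimal sketch of the completeness count, assuming answers are stored per section and question and that a blank answer is either missing or an empty string. The section names, question labels and values are hypothetical.

```python
# Hypothetical answers of one questionnaire, keyed by (section, question).
# None or an empty string stands for a blank answer.
answers = {
    ("targets", "Q1"): "Absolute target",
    ("targets", "Q2"): "",
    ("emissions", "Q3"): "12500",
    ("emissions", "Q4"): None,
}

def is_blank(value):
    """Treat None and whitespace-only strings as blank."""
    return value is None or str(value).strip() == ""

def responses_per_section(answers):
    """Count non-blank answers in each section."""
    counts = {}
    for (section, _question), value in answers.items():
        counts.setdefault(section, 0)
        if not is_blank(value):
            counts[section] += 1
    return counts

counts = responses_per_section(answers)
blank_share = sum(is_blank(v) for v in answers.values()) / len(answers)
print(counts, blank_share)
```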

Finally, we can look more closely at a simplified subset of the data, excluding the questionnaires requested by customers, the "breakdown" and "energy" sections of the questionnaire, and blank data.
Compared with the initial 2 million or so data points, the analyzed data set includes 473 thousand data points.
![Image10.png](attachment:Image10.png)
![Image11.png](attachment:Image11.png)
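
The three exclusions described above can be sketched as a single filter. The tuple layout and the sample rows are assumptions made for illustration only.

```python
# Hypothetical data points: (requester, section, value); illustrative only.
data_points = [
    ("investors", "targets", "yes"),
    ("investors", "breakdown", "120"),
    ("customers", "targets", "yes"),
    ("investors", "energy", "45"),
    ("investors", "emissions", ""),
    ("investors", "emissions", "980"),
]

EXCLUDED_SECTIONS = {"breakdown", "energy"}

def simplified_subset(rows):
    """Keep non-blank points from questionnaires not requested by customers,
    dropping the 'breakdown' and 'energy' sections."""
    return [
        (requester, section, value)
        for requester, section, value in rows
        if requester != "customers"
        and section not in EXCLUDED_SECTIONS
        and value.strip() != ""
    ]

subset = simplified_subset(data_points)
print(len(subset), "of", len(data_points), "data points kept")
```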

The emissions section of the questionnaire quantifies direct (scope 1) and indirect greenhouse gas emissions. Indirect emissions are divided into two categories: those linked to energy consumption (scope 2) and those not linked to energy consumption (scope 3); the latter cannot be summed because they can contain duplications.
In total, the 2020 questionnaires analyzed report 1531 million tons of CO2e of direct emissions and 324 million tons of CO2e of energy-related indirect emissions. Total emissions vary across sectors.
![Image12.png](attachment:Image12.png)
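
A minimal sketch of the aggregation by sector, with scope 3 deliberately left out of the totals for the reason given above. The record layout, sector names and figures are hypothetical.

```python
from collections import defaultdict

# Hypothetical per-company emissions in tons of CO2e (illustrative values).
reports = [
    {"sector": "Power generation", "scope1": 500.0, "scope2": 20.0, "scope3": 90.0},
    {"sector": "Power generation", "scope1": 300.0, "scope2": 10.0, "scope3": 40.0},
    {"sector": "Retail", "scope1": 15.0, "scope2": 60.0, "scope3": 700.0},
]

def totals_by_sector(rows):
    """Sum scope 1 and scope 2 per sector.

    Scope 3 is not summed because it can contain duplications.
    """
    totals = defaultdict(lambda: {"scope1": 0.0, "scope2": 0.0})
    for row in rows:
        totals[row["sector"]]["scope1"] += row["scope1"]
        totals[row["sector"]]["scope2"] += row["scope2"]
    return dict(totals)

totals = totals_by_sector(reports)
print(totals)
```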

By combining the information collected we can calculate an indicator of questionnaire quality. In particular, we can consider the number of questionnaires sent over the 3 years, the submission date of the questionnaires, the average number of answers in the targets section and the presence of a quantification of greenhouse gas emissions. The indicator ranges from a minimum of 0 to a maximum of 24 points.
![Image13.png](attachment:Image13.png)
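
One way such an indicator could be built is sketched below. The text above fixes only the four components and the 0 to 24 range; the equal weighting of 6 points per component, every threshold and the function name `quality_score` are assumptions made for illustration and may differ from the scoring behind the figure.

```python
def quality_score(years_submitted, days_before_deadline, target_answers, has_emissions):
    """Return a 0-24 quality score from four components of up to 6 points each.

    The equal 6-point weighting and all thresholds are assumptions made
    for illustration; they are not taken from the analysis itself.
    """
    # Component 1: number of years (out of 3) in which a questionnaire was sent.
    continuity = {0: 0, 1: 2, 2: 4, 3: 6}[years_submitted]
    # Component 2: how early the questionnaire was submitted.
    if days_before_deadline >= 14:
        timeliness = 6
    elif days_before_deadline >= 1:
        timeliness = 3
    else:
        timeliness = 0
    # Component 3: answers in the targets section, capped at 30.
    targets = round(6 * min(target_answers, 30) / 30)
    # Component 4: presence of a greenhouse gas emissions figure.
    emissions = 6 if has_emissions else 0
    return continuity + timeliness + targets + emissions

print(quality_score(2, 5, 15, True))
```

Averaging this score over the questionnaires of the companies headquartered in a territory would then give a quality index for that territory.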

Using the location of companies’ headquarters, we can also calculate the quality index by country and by state. Using the total greenhouse gas emissions of each territory, we can then calculate the coverage of the emissions reported in the questionnaires relative to total territorial emissions: states with low coverage percentages have fewer answers.
The following matrix summarizes the analysis. In the upper right quadrant we find states that have high-quality questionnaire responses and high coverage of the state's greenhouse gas emissions. In the upper left quadrant we find states that have high coverage of territorial emissions but low questionnaire quality: the companies located in these states should improve the compilation of the questionnaire. In the lower right quadrant we find states with high questionnaire quality but low coverage of territorial emissions: in this case it is necessary to increase the number of companies that respond to the questionnaire. Finally, in the lower left quadrant we find states with low quality of completed questionnaires and low coverage of total emissions: in this case it is desirable to increase both the number and the quality of the answers.
5 states are in the upper right quadrant, 8 in the lower right quadrant and 14 in the lower left.
The size of the circles can be used to show how the various states are distributed and to link the data with three social, environmental and economic indicators. For this analysis we use life expectancy at birth (which can be linked to the social maturity of the territory), the percentage of premature births (which can be linked to the environmental situation of the territory) and an economic indicator that measures inequality in the distribution of income (the Gini coefficient).
The most populous states are located on the left side of the matrix, with a quality level of responses below average.
![Image14.png](attachment:Image14.png)
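
The coverage calculation and the placement of a state in the matrix can be sketched as follows, with quality on the horizontal axis and coverage on the vertical axis. The state names, the figures and both split points are hypothetical; in particular, splitting quality at the mean and coverage at 30% is an assumption made for this sketch.

```python
def coverage(reported_emissions, territorial_emissions):
    """Share of a state's total emissions covered by the questionnaires."""
    return reported_emissions / territorial_emissions

def quadrant(quality, cov, quality_split, coverage_split):
    """Place a state in the matrix: quality on the x axis, coverage on the y axis."""
    vertical = "upper" if cov >= coverage_split else "lower"
    horizontal = "right" if quality >= quality_split else "left"
    return f"{vertical} {horizontal}"

# Hypothetical states: (quality index, reported Mt CO2e, territorial Mt CO2e).
states = {
    "State A": (20.0, 60.0, 100.0),
    "State B": (15.0, 10.0, 200.0),
}

# Split points are assumptions: mean quality and a fixed coverage level.
quality_split = sum(q for q, _, _ in states.values()) / len(states)
coverage_split = 0.30

placement = {
    name: quadrant(q, coverage(rep, terr), quality_split, coverage_split)
    for name, (q, rep, terr) in states.items()
}
print(placement)
```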

The states found in the upper left quadrant account for approximately 12% of total emissions.
![Image15.png](attachment:Image15.png)

Let us now analyze the relationship between the responses and some indicators that describe the social, environmental and economic situation of the territories.
If we consider life expectancy at birth, we see no clear relationship between the best responses and the social situation of the territories: in the upper right quadrant there is only one state in the best quintile and one in the second best quintile in terms of life expectancy at birth.
![Image16.png](attachment:Image16.png)
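
The quintile grouping used in this and the following comparisons can be sketched with a simple rank-based assignment. The values are hypothetical, and ranking into five equal-sized groups is an assumption about how the quintiles were formed.

```python
def quintiles(values, higher_is_better=True):
    """Assign each key a quintile from 1 (best) to 5 (worst) by rank."""
    ordered = sorted(values, key=values.get, reverse=higher_is_better)
    n = len(ordered)
    return {name: (rank * 5) // n + 1 for rank, name in enumerate(ordered)}

# Hypothetical life expectancy at birth, in years (illustrative values).
life_expectancy = {
    "A": 81.0, "B": 80.5, "C": 79.9, "D": 79.0, "E": 78.6,
    "F": 78.1, "G": 77.5, "H": 77.0, "I": 76.2, "J": 75.0,
}
q = quintiles(life_expectancy)
print(q)
```

For indicators where lower values are better, such as the percentage of premature births or the Gini coefficient, the same helper would be called with `higher_is_better=False`.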

Likewise, considering the percentage of premature births (and assuming that this indicator is connected with the environmental situation of the territory), we see that the upper right quadrant contains states with worse performance (the larger circles correspond to lower values of the percentage of premature births). No state in the best quintile or in the second best quintile is found in the upper right quadrant.
![Image17.png](attachment:Image17.png)

Considering an indicator of inequality in the distribution of income, the upper right quadrant contains no states in the best quintile or in the second best quintile (the larger circles correspond to lower values of the income concentration index).
![Image18.png](attachment:Image18.png)

# Conclusions
Analyzing the responses of 1278 companies to the climate change questionnaire, we find good coverage of the territories' total greenhouse gas emissions (the companies that respond to the questionnaire represent about 30% of total emissions) and a good level of response quality (the quality index has an average value of 18 on a scale from 0 to 24).
If we compare the quality and completeness of the questionnaires with indicators of the economic, social and environmental maturity of territories, however, we note that the states with the best performance have an insufficient level of response to the questionnaire, in both quality and quantity. This conclusion is influenced by the sectoral composition of the questionnaires: companies in the power generation sector represent 37% of the total emissions of the states located in the upper right quadrant, have a very high share of direct emissions (30% of the total) and a better than average quality score (19 compared with the average of 18). In the states located in the lower left quadrant, on the other hand, there are many more companies that belong to sectors other than power generation and that have lower quality levels of response to the questionnaire.
In conclusion:
* for states located in the lower part of the matrix, the number of companies that respond to the CDP questionnaire should be increased
* for states located in the left part of the matrix, the qualitative level of responses to the questionnaire by the companies that already respond should be improved
* improvements should be greater in states with better environmental, social and economic performance
* for companies in sectors other than power generation, the qualitative level of responses should be improved, in particular in terms of analysis and quantification of indirect emissions.