### Short Free Text Variable Coding plus Documentation

Apart from `v_3`, `v_19`, and `v_20`, which are list-supplanting (free text instead of a list of choices), all variables we are dealing with here are list-supplementing in some sense, i.e., they specify an `Other (please specify)` choice (although `v_194` needs special treatment).

___

**Preliminary Thoughts**

General problems with the short free text variable coding for list-supplementing answers:
* if we assign one of the suggested answers to a free text answer because we think it belongs there, we are, in a way, overriding the respondent's own choice
* if we create new codes, we cannot know whether other respondents would have picked one of them: they may have settled for the most suitable of the presented choices rather than being more precise in a free text answer

Against this background, one general suggestion could also be:
* Use all 'Other (please specify)' answers (i.e., free text answers to list-supplementing questions) to improve future surveys only
* Ignore the 'Other (please specify)' answers when 'blocking' on the variable in question

Follow-up problem:
* When searching for correlations between two variables with free text supplements, only respondents making list choices for *both* variables can be considered
* This drastically reduces the number of usable responses for some variables, e.g., the business sectors
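
The effect of ignoring `Other` answers can be illustrated on toy data. This is only a sketch: the frame `toy` is made up, and the label `'Other (please specify)'` is an assumption about how the `Other` choice is stored in the list variables.

```python
import pandas as pd

# Toy data (made up); the `Other` label is an assumption
other = 'Other (please specify)'
toy = pd.DataFrame({
    'v_1':  ['Finance', other, 'Energy', 'Railway'],
    'v_17': ['Developer', 'Architect', other, 'Customer'],
})

# Keep only respondents who made a list choice for *both* variables
usable = toy[(toy['v_1'] != other) & (toy['v_17'] != other)]
print(len(usable), 'of', len(toy), 'responses remain')
```

Every additional variable with an `Other` option that enters the analysis removes further rows in the same way.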

___

**Reasonable, Rule-Based Compromise**

* Free text answers from `<= 5 %` of the 488 respondents (at most 24 answers): drop the variables concerned without replacement: `Drop`
* Free text answers from `> 5 %` and `<= 10 %` of the 488 respondents (25 to 48 answers): code back where possible, drop the rest, don't create new codes: `Code Back`
* Free text answers from `> 10 %` of the 488 respondents (49 or more answers): code back where possible, group the rest, create new codes: `New Tags`

`v_194` receives special treatment since it is somewhat different (and is dropped afterwards).
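
The thresholds above can be sketched as a small helper. This is an illustration only: the function name `suggest_treatment` is hypothetical, and the special cases (`v_3`, `v_19`, `v_20`, `v_194`) are deliberately not handled.

```python
def suggest_treatment(n_answers: int, n_respondents: int = 488) -> str:
    """Map the number of free text answers to the rule-based suggestion.

    Hypothetical helper; the thresholds follow the 5 % / 10 % rule.
    """
    share = n_answers / n_respondents
    if share <= 0.05:
        return 'Drop'
    if share <= 0.10:
        return 'Code Back'
    return 'New Tags'


# Answer counts as encoded in the names of the free text files
for var, n in [('v_2', 87), ('v_5', 29), ('v_46', 24)]:
    print(var, suggest_treatment(n))
```

With 488 respondents, the boundaries fall between 24 and 25 answers and between 48 and 49 answers.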

**Plus:** Go through all dropped variables and identify opportunities to improve the next survey.

___
___

### Table of Contents and Overview

| Variable | Type (if not Supplementing) | Answers | Suggestion | Responsibility |
| --- | --- | ---: | --- | --- |
| [Variable 2](#v_2) | | 87 | `New Tags` | DMF |
| [Variable 3](#v_3) | Supplanting | 488 | (DONE) `New Tags` | CC |
| [Variable 5](#v_5) | | 29 | `Code Back` | DMF |
| [Variable 15](#v_15) | | 5 | (with `v_14`) `Drop` | - |
| [Variable 18](#v_18) | | 40 | `Code Back` | MF |
| [Variable 19](#v_19) | Supplanting | 488 | (DONE) `New Tags` | CC |
| [Variable 20](#v_20) | Supplanting | 403 | `New Tags` | JC |
| [Variable 22](#v_22) | | 13 | `Drop` | - |
| [Variable 35](#v_35) | | 16 | `Drop` | - |
| [Variable 46](#v_46) | | 24 | `Drop` | - |
| [Variable 52](#v_52) | | 18 | `Drop` | - |
| [Variable 60](#v_60) | | 27 | `Code Back` | JC |
| [Variable 67](#v_67) | | 13 | (with `v_66`) `Drop` | - |
| [Variable 81](#v_81) | | 15 | (with `v_80`) `Drop` | - |
| [Variable 96](#v_96) | | 22 | `Drop` | - |
| [Variable 105](#v_105) | | 11 | (with `v_104`) `Drop` | - |
| [Variable 113](#v_113) | | 43 | (also `Drop v_112`) `Code Back` | MF |
| [Variable 164](#v_164) | | 16 | `Drop` | - |
| [Variable 166](#v_166) | | 18 | `Drop` | - |
| [Variable 194](#v_194) | Special | 191 | `New Tags` | DMF, MK |

**Note**

The way the survey is constructed (**sigh**), there are in fact two types of list-supplementing short free text variables:
* Type I variables where `Other` is part of a single-selection list in a previous variable
* Type II variables where `Other` can be selected as part of a multi-selection question, i.e., there is a separate variable that merely holds the information of whether a respondent selected `Other`

**Consequences**

* In the case of Type II variables, we will want to remove the *`Other` switch variable* after executing our free text coding decision since it is only cluttering the data.
* The data coding and integration procedures for both types of variables will differ:
  - Type I variables (simple): add the codes in a separate column and merge on `lfdn`
  - Type II variables (less simple): add the codes in a separate column; for `Code Back`, overwrite the values in a copy of `df` where the code matches a variable name; for `New Tags`, introduce a new column and set its values
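
The two integration procedures can be sketched on toy data. Everything below is an assumption made for illustration: the toy values, the column names `v_5_coded` and `code`, and the choice to write a Type I code back into the list variable `v_4`. Only `lfdn`, the variable names, and the `quoted` / `not quoted` values come from the data.

```python
import pandas as pd

# Toy stand-in for the survey data (values are made up)
df = pd.DataFrame({
    'lfdn':  [30, 33, 41],
    'v_4':   ['Other (please specify)', 'Business information systems',
              'Other (please specify)'],
    'v_106': ['not quoted', 'quoted', 'not quoted'],
    'v_112': ['quoted', 'not quoted', 'quoted'],   # `Other` switch variable
})

# Type I: codes sit in a separate column and are merged on `lfdn`
codes_v5 = pd.DataFrame({
    'lfdn': [30, 41],
    'v_5_coded': ['Software-intensive embedded systems', None],
})
type1 = df.merge(codes_v5, on='lfdn', how='left')
# Assumption: where a code exists, it replaces the `Other` list choice
type1['v_4'] = type1['v_5_coded'].fillna(type1['v_4'])
type1 = type1.drop(columns='v_5_coded')

# Type II (`Code Back`): the code names the list variable to set to 'quoted'
codes_v113 = pd.DataFrame({'lfdn': [30, 41], 'code': ['v_106', None]})
type2 = df.copy()
for lfdn, code in codes_v113.dropna(subset=['code']).itertuples(index=False):
    type2.loc[type2['lfdn'] == lfdn, code] = 'quoted'
# The `Other` switch variable only clutters the data afterwards
type2 = type2.drop(columns='v_112')
```

Working on a copy of `df` keeps the original truth intact, so a coding decision can be revised without reloading the data.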

### Preparations and Explorations

In [2]:
import pandas as pd
import re, os

In [3]:
df = pd.read_csv('../../data/napire_truth.csv', sep=';')
df.head(2)

Unnamed: 0,lfdn,v_300,v_23,v_3,v_1,v_2,v_4,v_5,v_6,v_7,...,v_279,v_280,v_281,v_282,v_283,v_284,v_285,v_286,v_194,v_297
0,30,English,Japan,7,Agriculture,NotAnswered,Software-intensive embedded systems,NotAnswered,not quoted,quoted,...,NotShown,NotShown,NotShown,NotAnswered,NotAnswered,NotShown,NotShown,NotShown,Nothing,Since systems we developed face the natural en...
1,33,English,Argentina,5,Manufacturing,NotAnswered,Software-intensive embedded systems,NotAnswered,quoted,not quoted,...,NotAnswered,NotAnswered,NotAnswered,NotAnswered,NotAnswered,NotAnswered,NotAnswered,NotAnswered,NotAnswered,NotAnswered


In [4]:
df.shape

(488, 165)

In [5]:
basedir = '../../data/freetext'
short_vars = sorted([x for x in os.listdir(basedir) if 'short' in x],
      key=lambda x:int(x.split('_')[1]))

In [6]:
short_vars

['v_2_short_87.csv',
 'v_3_short_488.csv',
 'v_5_short_29.csv',
 'v_15_short_5.csv',
 'v_18_short_40.csv',
 'v_19_short_488.csv',
 'v_20_short_403.csv',
 'v_22_short_13.csv',
 'v_35_short_16.csv',
 'v_46_short_24.csv',
 'v_52_short_18.csv',
 'v_60_short_27.csv',
 'v_67_short_13.csv',
 'v_81_short_15.csv',
 'v_96_short_22.csv',
 'v_105_short_11.csv',
 'v_113_short_43.csv',
 'v_164_short_16.csv',
 'v_166_short_18.csv',
 'v_194_short_191.csv']

In [7]:
def get_var_file(varno: int, lfdn=False, df=False):
    """Load the short free text file for variable `varno`.

    Returns the full DataFrame if `df` is set, all columns (including
    `lfdn`) as an array if `lfdn` is set, and only the answers otherwise.
    """
    basedir = '../../data/freetext'
    # File names look like 'v_<varno>_short_<n>.csv'; match on the variable number
    this_file = [x for x in os.listdir(basedir)
                 if 'short' in x and int(x.split('_')[1]) == varno][0]
    print(this_file)
    data = pd.read_csv(f'{basedir}/{this_file}')  # read the file only once
    if df:
        return data
    elif not lfdn:
        return data[f'v_{varno}'].values
    else:
        return data.values

In [8]:
def prepare_for_excel_people(var, varname):
    # Export the answers for manual coding: sorted case-insensitively,
    # with an empty 'tag' column to be filled in by the coder
    varx = get_var_file(var, df=True)
    varx[f'v_{var}_lower'] = varx[f'v_{var}'].str.lower()
    varx['tag'] = ''
    (varx.sort_values([f'v_{var}_lower', 'lfdn'])
         .drop(f'v_{var}_lower', axis=1)
         .to_csv(f'../../data/freetext_tocode/v_{var}_{varname}_{len(varx)}.csv',
                 sep=';', index=False))

___
___

### Variable 2<a id="v_2"></a>

#### Please select the main industrial sector of your project and the application domain of the software you build.

Options presented in `v_1`:
1. Agriculture
2. Automotive
3. Finance
4. Healthcare
5. Security
6. Manufacturing 
7. Energy
8. Logistics
9. Railway
10. Avionics
12. Insurance
13. Education
14. Public sector
15. Enterprise resource planning 
16. Human resources
17. e-Government
18. Telecommunication
19. Games engineering
20. Public transportation
21. e-Commerce
22. Other (please specify)

`v_2` used to specify `Other`.

Potential coding principles:

* Aviation, Aerospace, Defence, Defense -> `Avionics`
* Bank, Banking Sector, Financial Market -> `Finance`
* Medical Research -> `Healthcare`
* ERP Product -> `Enterprise resource planning`
* Insurance -> `Insurance`
* Rename `Telecommunication` to `ICT` and group related answers with it: Communications, ...
* **New** category `Media`: Film, Television, Radio
* **New** category `Food`
* **New** category `Consulting`
* **New** category `Diverse`
* ...
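
The principles above could be applied with a simple lookup. The sketch below is not the final coding: the dictionaries only cover a few of the listed examples, and the names `code_back`, `new_tags`, and `suggest_code` are hypothetical.

```python
# Hypothetical lookup from lower-cased free text answers to codes,
# following some of the coding principles listed above
code_back = {
    'aviation': 'Avionics', 'aerospace': 'Avionics',
    'defence': 'Avionics', 'defense': 'Avionics',
    'bank': 'Finance', 'banking sector': 'Finance',
    'financial market': 'Finance',
    'erp product': 'Enterprise resource planning',
    'communications': 'ICT',
}
new_tags = {
    'film': 'Media', 'television': 'Media', 'radio': 'Media',
    'consulting': 'Consulting',
}


def suggest_code(answer: str) -> str:
    """Return a code for a free text answer, or '' if no rule applies."""
    key = answer.strip().lower()
    return code_back.get(key) or new_tags.get(key, '')


print(suggest_code(' Banking Sector '))  # Finance
```

Answers without a matching rule come back as an empty string and still need a manual decision.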

___

##### Suggestion: `Code Back` then `New Tags` then `Drop` `v_2`

In [21]:
#prepare_for_excel_people(2, 'sectors')

v_2_short_87.csv


In [82]:
set(x.lower() for x in get_var_file(2))

v_2_short_87.csv


{' waste collection and management',
 '1 project education + 1 project ecommerce ; industries of project varies ',
 'advertising',
 'aerospace',
 'automation',
 'aviation',
 'banking',
 'banking sector',
 'big science (particle physics research facilities)',
 'business software',
 'calendar app',
 'callcenter',
 'cloudservices for small and midsize business',
 'communications',
 'consulting',
 'consumer industry',
 'contact centers',
 'customer engagement ',
 'data analysis and visualization ',
 'defence',
 'defense',
 'developed in the context of a product of diverse applicability',
 'development of banking software',
 'digital out of the home',
 'domain-independent tool support for engineering and assurance of safety-criticla systems',
 'environment',
 'erp product',
 'film',
 'financial market',
 'food industry',
 'food industry - custom software',
 'fraud detection',
 'gastronomy',
 'gdpr - general data protection regulation',
 'grant management',
 'high tech',
 'hmi',
 'hobby, ent

___
___

### Variable 3<a id="v_3"></a>

#### How many people are involved in your project?

Coding Mode: Manual

Coder: CC

Note: Most entries were already integers. Where ranges were specified, I went with the lower bound.
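
Although the coding was done manually, the lower-bound rule could be approximated as follows. This is a sketch under the assumption that ranges are written with the lower bound first; the helper name `lower_bound` and the example strings are hypothetical.

```python
import re


def lower_bound(answer: str):
    """Return the first integer found in a free text answer, else None."""
    match = re.search(r'\d+', answer)
    return int(match.group()) if match else None


print(lower_bound('10-15'))  # 10
print(lower_bound('many'))   # None
```

Answers for which the helper returns `None` would still have to be coded by hand.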

___

##### Suggestion: Replace `v_3` by its coded version in the coded truth.

___
___

### Variable 5<a id="v_5"></a>

#### Please select the class of systems or services you work on in the context of your project.
Options presented in `v_4`:
* Software-intensive embedded systems
* Business information systems
* Hybrid of both software-intensive embedded systems and business
* Other (please specify)

`v_5` used to specify `Other`.

___

##### Suggestion: `Code Back` then `Drop`  `v_5`

In [24]:
#prepare_for_excel_people(5, 'systems_class')

v_5_short_29.csv


In [101]:
get_var_file(5)

v_5_short_29.csv


array(['Electrical lines', 'Hardware subsystem incl. SW',
       'Power electronics control with FPGA.',
       'varies based on the project, and I work on several at a time.  Mostly software to support state agency operations/services for Department of Health',
       'middleware', 'Railway Control Systems',
       'All aspects of IT support to the enterprise - infrastructure, applications, user interface, storage, etc.',
       'Software Intensive Embedded Systems, FPGA, IoT, Hybrid of last 3 entries',
       'PLM', 'Web-Application', 'Integrations between systems ',
       'applications for mobile devices',
       'applications for mobile devices',
       'IT Infrastructure and Security', 'AZURE',
       'Backend support and administration', 'Software as a Service',
       'Access Control (SW and Installations)', 'Web system',
       'Software Analysis Suite', 'ERP', 'Point of sale software',
       'Maschine controlling software', 'Mobile applications',
       'E-Commerce Platform 

___
___

### Variable 15<a id="v_15"></a>

#### Are there quality attributes which are of particularly high importance for your development project? If yes, which one(s)?

Situation: 

Here, we have only five answers: `v_14` is quoted six times in `df`, but only five respondents specified the `Other` in `v_15`.

Of the five answers, three (`lfdn` 82, 84, and 560) don't really specify an `Other`. 

The remaining two answers are `scalability` (`lfdn` 512) and a phrase in Portuguese (`lfdn` 1894) along the lines of `modularity` and `developability` (in the sense of suitability for agile development).
The respondents involved also quoted many other quality requirements (see below).

___

##### Suggestion: `Drop` `v_14` and `v_15`

In [27]:
df[df.v_14 == 'quoted'].T.loc[['v_14', 'v_15']]

Unnamed: 0,11,12,121,129,134,466
v_14,quoted,quoted,quoted,quoted,quoted,quoted
v_15,Quality in works plays a main role.,This is reverse-engeneering project - old syst...,scalability,NotAnswered,Satisfying the job to be done,"Evolução rápida em módulos compartimentados, p..."


In [84]:
get_var_file(15)

v_15_short_5.csv


array(['Quality in works plays a main role. ',
       'This is reverse-engeneering project - old system has to be re-developed on the base of modern technologies',
       'scalability', 'Satisfying the job to be done',
       'Evolução rápida em módulos compartimentados, para permitir desenvolvimento de modulos com sprints paralelos geridos por equipas em paralelo'],
      dtype=object)

In [23]:
df[df.lfdn == 512].T.loc[[f'v_{x}' for x in range(6,16)]]

Unnamed: 0,121
v_6,quoted
v_7,quoted
v_8,quoted
v_9,quoted
v_10,quoted
v_11,quoted
v_12,not quoted
v_13,quoted
v_14,quoted
v_15,scalability


In [24]:
df[df.lfdn == 1894].T.loc[[f'v_{x}' for x in range(6,16)]]

Unnamed: 0,466
v_6,not quoted
v_7,quoted
v_8,not quoted
v_9,not quoted
v_10,not quoted
v_11,not quoted
v_12,quoted
v_13,quoted
v_14,quoted
v_15,"Evolução rápida em módulos compartimentados, p..."


___
___

### Variable 18<a id="v_18"></a>

#### What is the main role you occupy in your project?

Options presented in `v_17`:

1. Business Analyst
2. Customer
3. Developer
4. Project Lead / Project Manager 
5. Product Owner
6. Product Manager
7. Scrum Master
8. Architect
9. Test Manager / Tester 
10. Marketing
11. Requirements Engineer
12. Other (please specify)

`v_18` used to specify `Other`.

___

##### Suggestion: `Code Back` then `Drop`  `v_18`

In [9]:
#prepare_for_excel_people(18, 'respondent_role')

v_18_short_40.csv


In [85]:
get_var_file(18)

v_18_short_40.csv


array(['Observer', 'System analyst (requirements engineering included)',
       'PM SE RE Support Engineer (PMO)', 'Systems Architect',
       'Administrator', 'qa', 'Certification Engineer, System Engineer',
       'Senior System Analyst',
       'All in one solution, I do everything myself',
       'Requirements Tooling Strategy', 'Frontend Developer',
       'design engineer', 'Analyst and UX Designer', 'CEO', 'Analyst',
       'Agile Coach', 'CIO', 'PMO', 'Business Analyst',
       'Process Quality Analyst (QA)',
       'Product Owner and Requirements Engineer',
       'Hybrid  Analyst/Project manager', 'DBA', 'Team Lead',
       'Analyst and Developer', 'System Engineer', 'Requirements Manager',
       'Engineering manager, fulfilling the PM and some other roles.',
       'Data scientist', 'Senior management', 'Security Manager',
       'Process analyst', ' ', 'Development Functional Manager',
       'Release Train Engineer (acc to Scaled Agile Framework, SAFe)',
       'Engagemen

___
___

### Variable 19<a id="v_19"></a>

#### How many years of industrial experience do you have in your role?

Coding Mode: Manual

Coder: CC

Note: Where multiple values were given, I went with the total experience specified.

___

##### Suggestion: Replace `v_19` by its coded version in the coded truth.

___
___

### Variable 20<a id="v_20"></a>

#### Do you have a certification in this role? If yes, which one?
___

##### Suggestion: `New Tags` then `Drop`  `v_20` 

In [25]:
#prepare_for_excel_people(20, 'certifications')

v_20_short_403.csv


In [70]:
from collections import Counter

In [71]:
Counter(get_var_file(20))

v_20_short_403.csv


Counter({'#VALUE!': 1,
         '-': 3,
         'Agile Scrum master': 1,
         'BCS Business Analysis Foundation': 1,
         'BCS International Diploma in Business Analysis, IREB CPRE Foundation, Professional Scrum Master I, Professional Scrum Product Owner I': 1,
         'BEng & Doulos': 1,
         'BPM PROCESS ANALYST': 1,
         'BS in Computer Science': 1,
         "Bachelor's degree": 1,
         'Business System Analysis Certificate (BSAC), Schulich Business Analyst Masters Certificate': 1,
         'CBAP': 4,
         'CBAP, SAFe Agilist': 1,
         'CCBA': 1,
         'CERTIFICATES OF VARIOUS COURSES': 1,
         'CISCO': 1,
         'CPRE': 3,
         'CPRE (Foundation Level)': 1,
         'CPRE AL': 1,
         'CPRE Advanced Level (Modeling)': 1,
         'CPRE FL': 1,
         'CPRE Foundation': 1,
         'CPRE Foundation Level': 1,
         'CPRE-FL': 2,
         'CPREFL': 1,
         'CSM': 2,
         'CSM and CSPO': 1,
         'CSM/CPO': 1,
         'CS

___
___

### Variable 22<a id="v_22"></a>

#### Which organisational role does your project team have in your project?

Options presented in `v_21`:

1. Customer
2. Main contractor (main responsible for the development project)
3. Sub-contractor (responsible for part of a larger development project) 
4. In-house development
6. Other (please specify)

`v_22` used to specify the `Other`.

___

##### Suggestion: `Drop` `v_22`

In [86]:
get_var_file(22)

v_22_short_13.csv


array(['Mix: Customer and Main-Contractor',
       'Part Customer, part Main contractor, part In-house development',
       'Co-Development of project',
       'All in one solution, I do everything myself',
       'Consultant for Business Analysis and Project Management',
       'Quality control', 'SA', 'own business', 'test factory',
       'In_house: Business Analysis to Design', '  ',
       'It is In-house development, but composed of external consultants',
       'Service designer'], dtype=object)

___
___

### Variable 35<a id="v_35"></a>

#### How are the requirements elicited in your project?

Options: 

* `v_28`: We elicit and / or refine requirements in several iterations
* `v_29`: We elicit and / or refine requirements in a specifically dedicated project phase
* `v_34`: Other (please specify)
* `v_35`: Short Free Text Specification

Situation:

Here, we have 16 answers. Some of them might be interpreted as `v_28` or `v_29`, others might not.
The answers that cannot plausibly be 'coded back' into the options originally presented mainly refer to options presented as part of the next question (see below, `v_46`).

___

##### Suggestion: `Drop` `v_34` and `v_35`

For the next survey, rethink the allocation of answer options to the individual questions regarding Requirements Elicitation (e.g., moving 'We do not elicit...ourselves' to this question).

In [36]:
df[df.v_34 == 'quoted'].T.loc[['lfdn'] + [f'v_{x}' for x in [28,29,34,35]]]

Unnamed: 0,29,34,42,61,152,166,218,228,337,345,386,398,404,441,449,476
lfdn,147,163,180,249,683,742,928,952,1321,1344,1457,1483,1542,1744,1763,1928
v_28,not quoted,quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted
v_29,not quoted,quoted,not quoted,not quoted,not quoted,not quoted,not quoted,quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted
v_34,quoted,quoted,quoted,quoted,quoted,quoted,quoted,quoted,quoted,quoted,quoted,quoted,quoted,quoted,quoted,quoted
v_35,We try to elicit requirements in dedicated pha...,Depends upon the project,Waterfall,I create my own projects and so I write my own...,HYBRID,Long waterfall process (6 months) which is it...,Based on the source code of the current versio...,Customer and vendor choosing analysis models a...,It is handled by the product manager,Work meetings,Workshops and forms,WORKSHOPS (several iterations),We will like to refine them in several iterati...,We hardly do,Requirements are usually elicited by project m...,This is taken by someone else. But we do make ...


In [74]:
get_var_file(35)

v_35_short_16.csv


array(['We try to elicit requirements in dedicated phase, but end up eliciting requirements througout the project on an ad-hoc basis.',
       'Depends upon the project', 'Waterfall',
       'I create my own projects and so I write my own requirements',
       'HYBRID', 'Long waterfall process (6 months)  which is iterated.',
       'Based on the source code of the current version of the product, adding improvements and corrections identified, assessing the impact of the project.',
       'Customer and vendor choosing analysis models and check its fit or resolve our business needs.',
       'It is handled by the product manager', 'Work meetings',
       'Workshops and forms', 'WORKSHOPS (several iterations)',
       'We will like to refine them in several iterations with the project team in an agile way, but that does hardly happen and in many cases requirements are either solely elicited from the PO and many times refined during the development (e.g. the sprint)',
       'We hardly do

___
___

### Variable 46<a id="v_46"></a>

#### Which techniques do you use for your requirements elicitations?

Options:

* `v_36`: Interviews
* `v_37`: Analysis of existing documents
* `v_38`: Risk analyses
* `v_39`: Prototyping
* `v_40`: Workshops and focus groups
* `v_41`: (Requirements) Reuse databases and guidelines
* `v_42`: Design Thinking / Lean Startup
* `v_43`: External experts
* `v_44`: Observations
* `v_47`: We do not elicit requirements (ourselves)
* `v_45`: Other
* `v_46`: Short Free Text to specify `Other`

___

##### Suggestion: `Drop` `v_45` and `v_46`

In [78]:
df[df.v_46 != 'NotAnswered'][['lfdn']+[f'v_{x}' for x in range(36,47)]].T

Unnamed: 0,12,18,20,48,58,61,62,162,185,204,...,290,356,386,404,434,441,459,470,476,487
lfdn,84,114,118,199,230,249,258,729,839,896,...,1161,1357,1457,1542,1717,1744,1840,1909,1928,1969
v_36,not quoted,not quoted,not quoted,not quoted,quoted,not quoted,quoted,quoted,quoted,quoted,...,quoted,quoted,not quoted,quoted,quoted,not quoted,not quoted,not quoted,not quoted,quoted
v_37,quoted,quoted,quoted,not quoted,quoted,not quoted,not quoted,quoted,quoted,not quoted,...,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,quoted
v_38,not quoted,not quoted,quoted,not quoted,quoted,not quoted,not quoted,quoted,not quoted,not quoted,...,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,quoted
v_39,not quoted,not quoted,not quoted,quoted,quoted,not quoted,not quoted,quoted,quoted,quoted,...,not quoted,quoted,quoted,not quoted,quoted,not quoted,not quoted,not quoted,not quoted,quoted
v_40,not quoted,not quoted,not quoted,not quoted,quoted,not quoted,not quoted,quoted,quoted,quoted,...,quoted,quoted,quoted,not quoted,quoted,not quoted,quoted,not quoted,not quoted,quoted
v_41,not quoted,quoted,quoted,quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,...,not quoted,quoted,not quoted,not quoted,quoted,not quoted,not quoted,not quoted,not quoted,not quoted
v_42,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,...,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted
v_43,quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,...,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted,not quoted
v_44,not quoted,not quoted,not quoted,not quoted,quoted,not quoted,quoted,quoted,quoted,not quoted,...,not quoted,quoted,not quoted,not quoted,quoted,not quoted,not quoted,not quoted,not quoted,quoted


In [72]:
get_var_file(46)

v_46_short_24.csv


array(['The main source of the requirements is the source program code of old system.',
       'stakeholder requests from customer',
       'requirement specification form customer', 'Reverse Engineering',
       'Many times, requirements also come from the customer. Although we still need to refine to assure we understand their needs/wants. ',
       'I have my own way of doing things. I combine all techniques from school and indy projects to understand what I am doing.',
       'Mockups',
       'Participation in the team of a specific profile with expertise in the \u200b\u200bindustry sector the project is developed',
       'BEAM*', 'Market analysis and analysis of similar products',
       'Early validation of development for corrections in flight where possible',
       'We are working together with the future stakeholder of the new product. The prototype is implemented in that way that we can use productive data in order to simulate the workflows. There requirements are formulat

___
___

### Variable 52<a id="v_52"></a>

#### Who has the primary responsibility for eliciting requirements?

Options presented in `v_51`:

1. Marketing
2. Business Analyst
3. Requirements Engineer
4. Project Lead / Project Manager 
5. Scrum Master
6. Product Owner
7. Product Manager
8. Architect
9. Developer
10. Customer
12. Other
13. Nobody has the primary responsibility

`v_52` used to specify `Other`.

___

##### Suggestion:  `Drop`  `v_52` 

In [98]:
get_var_file(52)

v_52_short_18.csv


array(['Engineering Manager ', 'Main contractor', 'System Analyst',
       'All in one solution, I create my own project and I am responsible for everything',
       'end user',
       "As 'Systems Design Engineer' I have primary responsibility for eliciting requirements; I also act as Product Owner for software development.",
       'Customers use to have the business requirement defined and documented at a high level + team implementing specific features digs deep with the support of Manager&ScrumMaster',
       'Business Analyst together with Project Manager and Heads of Development',
       'In contact with end users', 'functional analyst',
       'research team and UX / UI design', 'Systems Analyst',
       'Domain expert', 'All team members participate',
       'Consultants (no software development projects)',
       'Mostly the PO, in some cases the business stakeholders write the stories and they are passed like that to the team, or other POs write them when there are dependenc

___
___

### Variable 60<a id="v_60"></a>

#### At what level of granularity do you document requirements, and when?

Options presented in `v_59`:

1. We document detailed requirements at the beginning of the project.
2. We document high-level requirements at beginning of the project and refine them to detailed requirements when needed (for instance, we document epics and refine them to user stories for the sprints).
3. We do not document requirements.
4. Other (please specify)

`v_60` used to specify `Other`.

___

##### Suggestion: `Code Back` then `Drop`  `v_60` 

In [26]:
#prepare_for_excel_people(60, 'documentation_granularity')

v_60_short_27.csv


In [88]:
get_var_file(60)

v_60_short_27.csv


array(['we document detailed requirements through the analysis of the existing business logic implemented in the old system - component by component, already more than 1,5 year ',
       'We document high-level requirements, PMO expects this to be enough information and then we have to refine them during development, sometimes delaying the delivery of an increment.',
       'Actualy we have bouth first two point, we create detailed requirementes and we detalaize and improve them in the process',
       'I apply every technique that I know to decompose the problem state until I can visualize the requirements. ',
       'mix of epics for for the next year and detail them for the next 2 months. Challenge is, that we have an external company developing. So far we did plans for a year. So we needed to know what we wanted to do next year.',
       'Depends on the risk and the complexity, simple straightforward aspects high level, sw agile, complex aspect with lots of detail if needed',
     

___
___

### Variable 67<a id="v_67"></a>

#### How do you make use of the documented requirements?

Options presented in `v_61` to `v_66`:
* They are the basis for the implementation.
* They are source for tests.
* They are used in customer acceptance.
* They are part of the contract.
* They are a reminder for further discussions with the customer, product owner, and / or other team members
* Other (please specify)

`v_67` used to specify `Other`.

___

##### Suggestion: `Drop` `v_66` and  `v_67` 

In [79]:
get_var_file(67)

v_67_short_13.csv


array(['Basic for Stakeholders', 'Document previously developed Software',
       'Documents become the plan to execute',
       'They are the specification of the system that is in production/live',
       'Compliance / Quality Gates',
       'Manage change & impact, traceability, mbse', 'ISO 26262',
       'Frequently an initial understanding of a requirement is enriched and sometimes better understood and changed as we go - we just keep an eye on anything that flip-flops, as this will blow the timeframes.',
       'EN62304',
       'Many features have several requirements, they get implemented along several documenteduser stories, the user stories we write, serve also as a basis for tracking the work and commiting the code and traceability becomes very poor',
       'they are used to decide who is responsible for an issues and what budget needs to be spent in oder to solve this issue',
       "Wouldn't know", 'To final documentation'], dtype=object)

___
___

### Variable 81<a id="v_81"></a>

#### For which information do you make an explicit distinction when documenting your requirements?

Options presented in `v_68` to `v_80`:

* Architectural constraints
* Development process aspects
* Formal properties
* Functional properties of the system
* Goals
* Quality properties
* Rules (e.g., business rules)
* Stakeholders
* System behavior
* Technical interfaces
* Usage scenarios
* User interface(s)
* Other (please specify)

`v_81` used to specify `Other`.

Situation:

We have 15 short answers, most of which show (implicitly or explicitly) that the respondent did not understand the question.

___

##### Suggestion: `Drop` `v_80` and `v_81`

In [73]:
get_var_file(81)

v_81_short_15.csv


array(['Documents produced by the system, case properties',
       "Don't quite understand the question: Of course we have a data model, consisting of requirements types and relationships, at various levels.",
       'Mixed Levels, Written with a knowledge of existing Software',
       'Everything above', 'I do not understand this question',
       'Scope, statement of purpose, glossary',
       'Depending on project type',
       'Inputs, calculations, outputs for the developer and non-functional specs for the quality of output.',
       'Business process, domain model', 'Security',
       'All of the above - depending on the requirement',
       'Also dependencies with other stories when the feature has to be developed through several teams (e.g. platform, logistics, customer service tool)',
       'Specific programming interface (IF, API), for integrated control-systems that interface our software and system.',
       'I do not produce requirement documents', 'acceptance criteria'],

___
___

### Variable 96<a id="v_96"></a>

#### How do you document requirements?

Options shown in `v_95` were:

* Activity diagrams
* Business process models
* Class diagrams
* Goal models
* Natural language / informal (plain) text
* Prototypes / User screens
* Sequence diagrams
* Sketches
* State machines
* Structured lists of requirements
* Use case diagrams
* Use cases
* User stories
* Other (please specify)

`v_96` used to specify `Other`.

___

##### Suggestion:  `Drop`  `v_96` 

In [89]:
get_var_file(96)

v_96_short_22.csv


array(['Work sketches, for billing ',
       'natural language in phrase templates',
       'Supplemental specifications, document specifications, case property specifications, Context Diagram, ERD/Conceptual Data Model, User Interface Specification',
       'Semi structured list of requirements',
       'We capture requirements on different levels in different forms.',
       "formal text (using the Sophist template for specifying RQ's in text)",
       'Data model', 'Any and all techniques and methods',
       'formal plain text as Requirements Templates (Satzschablonen)',
       'Any means to clearafy what has to be realized.',
       "Task descriptions. Look like use cases but don't specify a dialog. Specify what the users want to achive. The supplier specifies how his system supports the users. ",
       'SysML Requirements Models',
       'Sentence templates with regard to the SOPHIST Group',
       'Story mapping',
       'A short text (like User Stories) recorded in a software 

___
___

### Variable 105<a id="v_105"></a>

#### Which classes of non-functional requirements do you explicitly consider in your requirements documentation?

`v_97` to `v_103` and `v_303` mention different classes of quality requirements; `v_104` offers `Other (please specify)`.

`v_105` used for specification of `Other`.

___

##### Suggestion: `Drop`  `v_104` and `v_105` 

In [90]:
get_var_file(105)

v_105_short_11.csv


array(["We consider relevant non-functional requirements, but what's relevant can change from project to project.",
       'Weak in this area, still learning', 'Quality',
       'All above (acoording to ISO 9126/25010)', 'Data conversion',
       'legal, functional safety, ',
       'Nenhum em especial, mas qualquer um deles se destacado como especialmente importante pelo PO naquela história',
       'Nenhum',
       'Los atributos de calidad o requerimientos no funcionales se definen independientemente de los requisitos funcionales u ordinarios.',
       'No documentation', "I don't do requirement documents"],
      dtype=object)

___
___

### Variable 113<a id="v_113"></a>

#### How do you verify and / or validate your requirements?

Options offered in `v_106` to `v_112`:

* Automatic checking
* Informal peer reviews
* Inspections (formal technical reviews using reading techniques or checklists)
* Simulations
* Walkthroughs
* We do not verify and / or validate our requirements
* Other (please specify)

`v_113` used to specify `Other`.

Note: `and / or` is an awkward 'operator' in a survey question, since an answer may refer to verification, validation, or both.

___

##### Suggestion: `Code Back` then `Drop`  `v_112` and `v_113` 

In [27]:
#prepare_for_excel_people(113, 'verification_validation')

v_113_short_43.csv


In [91]:
get_var_file(113)

v_113_short_43.csv


array(['Regression/formal tests', 'close contact with testers',
       'in a feedback phase with the customers',
       'verification (known as check) and validation (know as approval) of the functional/engineering specifications acting as a requirements register',
       'Does the requirement match the visual cue',
       'informal reviews by team members (which are not in the RE-Role and have no RE-certificate but are the end users so the stake holders) and they must undertand what is ordered. Then we do workshops with the developers.',
       'Prototyping, sketches', 'Metric reviews',
       'Get feedback from stakeholders including suppliers',
       'verification & validation review by selected stakeholders',
       'according A-SPICE (Level 2)', 'desk check review made via mail',
       'Rules for attributes and sentence patterns (e.g. active, always name the actor, ...)',
       'with meetings', 'Acceptance Criteria on User Stories',
       "There's formal step where a technical

___
___

### Variable 164<a id="v_164"></a>

#### How do you align software testing with requirements?

Options shown in `v_158` to `v_163` were:

* Testers participate in requirements reviews.
* We check the coverage of requirements with tests.
* We define acceptance criteria and tests for requirements.
* We derive tests from system models.
* We do not align tests and requirements.
* Other (please specify)

`v_164` used to specify `Other`.

___

##### Suggestion: `Drop`  `v_163` and `v_164` 

In [92]:
get_var_file(164)

v_164_short_16.csv


array(['Use cases are the base for development of uat cases',
       'We need to test some electrical equipments',
       'Does testing result in the desired effect',
       'We check that each requirement is met. Each requirement has a reference to the parts of the test script that tests it.',
       'Acceptance Test Driven Development, by involving testers during creation of reqs',
       'we have completely outsourced all testing activities', 'UAT, BAT',
       'Sometimes free testing shows up variations, changes or additions to requirements.  ',
       'deriving escenarios (BDD)',
       'testing is part of the development, not a separate step.',
       'WHILE THE SCRUM is performed, testing is performed',
       "We're early in a lean startup mode, so we're just prototyping to test different hypothesis with customers",
       'The QA team is not co-located, they mainly do functional manual testing according to the agreed testing scenarios and coverage with the PO, however many tec

___
___

### Variable 166<a id="v_166"></a>

#### How do you deal with changing requirements after the initial release of the system (or parts of it)?

Options shown in `v_165` (select one):

* We work with change requests, but do not further update our requirements specification once formally accepted.
* We update our product backlog.
* We work with change requests and continuously update our requirements specification accordingly also after formally accepted.
* Other (please specify)
* We don't update our requirements (documentation) at all.

`v_166` used to specify `Other`.

___

##### Suggestion: `Drop` `v_166`

In [93]:
get_var_file(166)

v_166_short_18.csv


array(['We update the requirements, but not the official requirements documentation.',
       'We perform changes on the system description in each iteration. Tool support is key here (baselining and baseline compare, suspect links, workflow states reflecting chanages, etc.)',
       'Change request for change in scope',
       "I don't implement until all requirements are finalized, hybrid waterfall, incremental, and iterative build ",
       'we upate until officially handed over to external development. After that we either create new requirements or add notes to exsiting ones, clarifying them.',
       'for software, we use SCRUM Teams ',
       'Changes requests for Waterfall projects - ideally in a Requirements Management system like BluePrint - update the Product Backlog for Agile projects',
       'change management addenda are made but the previously accepted specification is unchanged',
       'The idea is to use change requests, but there is no concept yet.',
       'I have 

___
___

### Variable 194<a id="v_194"></a>

#### Besides the problems listed in the previous questions, is there another prominent problem you experienced in your project? If so, which one?

Problems mentioned in previous questions:

* Communication flaws within the project team 
* Communication flaws between the project and the customer 
* Terminological problems
* Incomplete or hidden requirements
* Insufficient support by project lead
* Insufficient support by customer
* Stakeholders with difficulties in separating requirements from solution designs
* Inconsistent requirements
* Missing traceability
* Moving targets (changing goals, business processes and / or requirements)
* Gold plating (implementation of features without corresponding requirements)
* Weak access to customer needs and / or (internal) business information
* Weak knowledge about customer's application domain
* Weak relationship between customer and project lead
* Time boxing / Not enough time in general
* Discrepancy between high degree of innovation and need for formal acceptance of (potentially wrong / incomplete / unknown) requirements
* Technically unfeasible requirements
* Underspecified requirements that are too abstract and allow for various interpretations
* Unclear / unmeasurable non-functional requirements
* Volatile customer's business domain regarding, e.g., changing points of contact, business processes or requirements

___

##### Suggestion: `Special Treatment`: group the additional problems mentioned, label and count them, report the results separately (without integrating them into the df or a copy of it), use them to improve future surveys, then `Drop` `v_194`
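The grouping, labeling, and counting step could be sketched as follows. This is only an illustration under assumptions: the labels in `LABELS` and the set of non-answers are hypothetical, and the real grouping would be done manually over all 191 answers:

```python
from collections import Counter

# Answers that name no additional problem (compared case-insensitively).
NON_ANSWERS = {"", "no", "none", "nothing"}

# Hypothetical manual labels for a few free text answers.
LABELS = {
    "politics getting in the way of engineering": "Organizational politics",
    "Incompetent vendor project team": "Vendor competence",
}


def count_problem_labels(answers, labels):
    """Count how often each label occurs, skipping non-answers."""
    counts = Counter()
    for answer in answers:
        text = answer.strip()
        if text.lower() in NON_ANSWERS:
            continue
        # Answers that have not been labeled yet are collected separately.
        counts[labels.get(text, "Unlabeled")] += 1
    return counts


answers = [
    "Nothing", " ", "No",
    "politics getting in the way of engineering",
    "None",
    "Incompetent vendor project team",
]
report = count_problem_labels(answers, LABELS)
for label, n in report.most_common():
    print(label, n)
```

The resulting counts are kept in a separate report and are not written back into the data frame, in line with the suggestion above.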

In [28]:
#prepare_for_excel_people(194, 'other_reproblem')

v_194_short_191.csv


In [97]:
get_var_file(194)

v_194_short_191.csv


array(['Nothing', ' ', 'No',
       'General dissatisfaction in customers because the system is not the same as the old one.',
       'politics getting in the way of engineering', 'None',
       'Lack of support from management to use agile processes. While management says that they understand it and are in agreement, in practice they still hold onto the traditional ways of thinking, and expect the outcomes of agile but with waterfall practices. Also, some regulated industries insert an extra layer of requirements that serve a purpose, but are also very constraining.',
       'Requriments engineering have a bad rep in agile development. It is seen as constricting.  lets just start and see how we do  seems to be the best practice in scrum.',
       'Ownership of project and project goal on business side is missing.',
       'Incompetent vendor project team',
       'missing willingness to understand the customers needs by project people. Sometimes they behave like a proxy between custom

The End.