(Maybe) Incorrect cost in COCOMO calculations #281

Closed
Theldus opened this issue Aug 19, 2021 · 6 comments
Labels: enhancement, help wanted
@Theldus

Theldus commented Aug 19, 2021

Description & steps to reproduce

Hi,
I've been testing scc on a project of mine (PBD) and the estimated cost just seems to be way higher than expected:

$ scc --version
scc version 3.0.0

$ git clone https://github.com/Theldus/PBD
Cloning into 'PBD'...
remote: Enumerating objects: 845, done.
remote: Counting objects: 100% (493/493), done.
remote: Compressing objects: 100% (304/304), done.
remote: Total 845 (delta 327), reused 318 (delta 164), pack-reused 352
Receiving objects: 100% (845/845), 248.96 KiB | 0 bytes/s, done.
Resolving deltas: 100% (590/590), done.

$ scc PBD/
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
C                           16      7856      927      2645     4284        796
C Header                    15      1543      186       622      735        113
Makefile                     3       303       42       115      146          2
Assembly                     2       502       38         0      464          8
Shell                        2       252       31        79      142         12
License                      1        19        3         0       16          0
Markdown                     1       223       44         0      179          0
R                            1        98       11        35       52          0
YAML                         1        49        4         4       41          0
gitignore                    1        33        1        22       10          0
───────────────────────────────────────────────────────────────────────────────
Total                       43     10878     1287      3522     6069        931
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop (organic) $179,418
Estimated Schedule Effort (organic) 7.159522 months
Estimated People Required (organic) 2.226385
───────────────────────────────────────────────────────────────────────────────
Processed 292631 bytes, 0.293 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────

I may be completely mistaken (and I apologize in advance), but $179k for a 6069 SLOC project with 2 developers over 7 months seems far above the average developer salary.

Initial investigation

Although I'm not a Go programmer, below is my attempt to better understand the issue.

The organic COCOMO calculation code (cocomo.go) can be quickly reimplemented with the following Python code:

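def estimate_cost(effort, wage):
    # Integer division (4690 per month) is what reproduces scc's $179,418.
    return effort * float(wage // 12) * 2.4

def estimate_effort(sloc):
    # Basic COCOMO, organic: person-months = 2.4 * KSLOC**1.05
    return 2.4 * pow(float(sloc)/1000, 1.05) * 1

def estimate_schedule_months(effort):
    # Schedule in months = 2.5 * effort**0.38
    return 2.5 * pow(effort, 0.38)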

and the project cost shown above can be calculated as follows (using the same average wage found in processor.go):

>>> effort=estimate_effort(6069)
>>> cost=estimate_cost(effort, 56286)
>>> months=estimate_schedule_months(effort)
>>> people=effort/months
>>>
>>> effort
15.939850215524215
>>> cost
179418.95402594056
>>> months
7.159521842460346
>>> people
2.2263847455553734

which matches exactly the values reported by scc.

However, the EstimateCost method only takes effort and wage into account, and effort represents neither the number of people in the project nor the estimated time in months.

Maybe I got it wrong, but in my head the correct way to generate the cost would be:

  • Calculate the effort from SLOC (E).
  • Calculate the estimated time in months from the effort (D).
  • Calculate the number of people needed from effort and time (P).
  • Calculate the cost, based on average salary, time (D) and people (P), i.e. cost = (avg_wage/12) * D * P.

Thus, PBD's real cost would be:

>>> true_cost=(56286//12) * months * people
>>> true_cost
74757.89751080857
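
A side note on the arithmetic: since the number of people is defined as effort divided by schedule, the product D * P is just the effort E again, so the proposed formula reduces to (avg_wage/12) * E. Below is a minimal sketch that checks this with the numbers above. It assumes the integer monthly wage of 4690 that the printed outputs imply, and the helper names are illustrative, not taken from scc.

```python
# Illustrative check only; these function names are hypothetical, not scc's API.

def effort_person_months(sloc):
    # Basic COCOMO (organic): E = 2.4 * KSLOC**1.05
    return 2.4 * (sloc / 1000.0) ** 1.05

def schedule_months(effort):
    # D = 2.5 * E**0.38
    return 2.5 * effort ** 0.38

monthly_wage = 56286 // 12   # 4690, integer division as implied by the outputs above
E = effort_person_months(6069)
D = schedule_months(E)
P = E / D                    # people = effort / schedule

cost_from_d_and_p = monthly_wage * D * P
cost_from_effort = monthly_wage * E

print(round(cost_from_d_and_p, 2))
print(round(cost_from_effort, 2))
print(round(cost_from_effort * 2.4, 2))  # same value scaled by the 2.4 factor
```

Both products come out at roughly 74,757.90, and scaling by the 2.4 factor from estimate_cost gives the 179,418.95 that scc reports. In other words, the two approaches differ only by that constant factor.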

Desktop (please complete the following information):

  • OS: Slackware Linux (x86_64)
  • Version: 14.2
@boyter
Owner

boyter commented Aug 19, 2021

So... I based the calculations on what COCOMO said, and you might be right about it. However, I also compared it to what sloccount produced. Given commit 18ff8cd1fbe0d4c4bc6d090ef58a513ecb9767da of redis (https://github.com/redis/redis), I get the following:

Total Physical Source Lines of Code (SLOC)                = 180,565
Development Effort Estimate, Person-Years (Person-Months) = 46.83 (561.92)
 (Basic COCOMO model, Person-Months = 2.4 * (KSLOC**1.05))
Schedule Estimate, Years (Months)                         = 2.31 (27.72)
 (Basic COCOMO model, Months = 2.5 * (person-months**0.38))
Estimated Average Number of Developers (Effort/Schedule)  = 20.27
Total Estimated Cost to Develop                           = $ 6,325,675
 (average salary = $56,286/year, overhead = 2.40).

and from searchcode,

Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
C                          296    180267    20460     31706   128101      32540
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop (organic) $6,896,435
Estimated Schedule Effort (organic) 28.647811 months
Estimated People Required (organic) 21.386966

The cost is close, and so is the number of developers, although the effort is not.
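
The SLOCCount figures above can be re-derived from the two formulas it prints. This is a rough sketch, assuming the cost is computed as person-years times average salary times overhead (which is consistent with the printed total). The variable names are illustrative.

```python
# Re-deriving the SLOCCount numbers for redis from its printed formulas.
# Assumption: cost = person-years * average salary * overhead.

sloc = 180565
salary = 56286
overhead = 2.4

person_months = 2.4 * (sloc / 1000.0) ** 1.05   # Basic COCOMO effort
schedule = 2.5 * person_months ** 0.38          # schedule in months
developers = person_months / schedule           # effort / schedule
cost = (person_months / 12.0) * salary * overhead

print("person-months: %.2f" % person_months)
print("schedule:      %.2f months" % schedule)
print("developers:    %.2f" % developers)
print("cost:          $%d" % round(cost))
```

The effort, schedule and developer count should agree with the SLOCCount output to the printed precision, and the cost should land within a few dollars of $6,325,675.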

Need some more investigation. I copied the COCOMO from a Python implementation back when I wrote it.

Keep in mind COCOMO is meant to account for the cost of a building + electricity and everything else that goes into writing the code as well. Which might explain it being too high?

You can also tweak all of the values if you need, although that might be moot if I got the implementation wrong.

boyter added the "help wanted" label on Aug 19, 2021
@Theldus
Author

Theldus commented Aug 19, 2021

I see... I also checked the SLOCCount source code and your calculations are really an adaptation of it.

Keep in mind COCOMO is meant to account for the cost of a building + electricity and everything else that goes into writing the code as well. Which might explain it being too high?

Indeed, but the calculated cost still seems quite high to me. In fact, there is something interesting in the SLOCCount source code: the get_sloc file contains this snippet:

# Overhead; the person cost is multiplied by this value to determine
# true annual costs.

$overhead = 2.4;

This 'overhead' variable seems to be exactly what you suspected, and it is actually used in your code too, but with no indication of what it represents. Without it, scc's result is exactly the same as the cost I calculated, as expected.

Maybe it would be useful if, just as there is an '--avg-wage' parameter, there were also an '--overhead' one to make this configurable, since different scenarios can lead to different costs, either lower or higher.
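
To illustrate what such an option would change, here is a small sketch of the cost formula with the overhead pulled out as a parameter. The function name and the `overhead` argument are hypothetical and only mirror the Python code earlier in this thread. They are not scc's actual Go signature.

```python
def estimate_cost(effort, wage, overhead=2.4):
    # overhead multiplies the raw wage cost; 2.4 is the SLOCCount default.
    # wage // 12 keeps the integer monthly wage implied by scc's output.
    return effort * float(wage // 12) * overhead

# Organic effort for PBD (6069 SLOC), in person-months.
effort = 2.4 * (6069 / 1000.0) ** 1.05

for overhead in (1.0, 1.5, 2.4):
    print(overhead, round(estimate_cost(effort, 56286, overhead), 2))
```

With an overhead of 1.0 the result is the roughly $74.8k computed earlier, and with 2.4 it is the $179,418 reported by scc.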

Anyway, feel free to close the issue if necessary and thank you very much for the feedback =).

boyter added the "enhancement" label on Aug 20, 2021
@boyter
Owner

boyter commented Aug 20, 2021

Adding overhead as an additional value to tweak seems sensible. The whole COCOMO model is configurable at present anyway so adding that makes sense.

Not going to close; I'll add that as an additional value you can play with.

@fschaefer
Contributor

I'm just investigating scc and COCOMO for cost estimation and stumbled upon this issue. So here are just two comments on what caught my eye here (but maybe I'm totally wrong here).

@Theldus:

I may be completely mistaken (and I apologize in advance), but $179k for a 6069 SLOC project with 2 developers over 7 months seems far above the average developer salary.

Quoting from SLOCCount User's Guide:

You may be surprised by the high cost estimates, but remember, these include design, coding, testing, documentation (both for users and for programmers), and a wrap rate for corporate overhead (to cover facilities, equipment, accounting, and so on). Many programmers forget these other costs and are shocked by the high figures. If you only wanted to know the costs of the coding, you'd need to get those figures.

So you're absolutely right: the Estimated Cost to Develop is far more than an average developer salary and is massively increased by the overhead factor of 2.4 you mentioned. A factor of >2 seems very plausible to me considering building costs, property taxes, utilities, equipment, insurance, benefits, pensions, social security, Medicare etc. I've added a flag for tuning the overhead in scc and opened a pull request.

@boyter:

[comparing the output for redis]
The cost is close, and so is the number of developers, although the effort is not.

If I get this right, scc seems pretty close to SLOCCount in all values:
scc: $6,896,435, 28.647811 months, 21.386966 people
SLOCCount: $6,325,675, 27.72 months, 20.27 developers
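
To put a number on "pretty close", the relative differences can be computed directly from the two redis outputs quoted earlier. This is only a quick sketch and the names are illustrative.

```python
# (scc, SLOCCount) pairs taken from the two redis outputs quoted in this thread.
figures = {
    "cost ($)": (6896435, 6325675),
    "schedule (months)": (28.647811, 27.72),
    "people": (21.386966, 20.27),
}

def relative_difference(a, b):
    # Difference of a relative to b, as a percentage.
    return (a - b) / b * 100.0

for name, (scc_value, sloccount_value) in figures.items():
    diff = relative_difference(scc_value, sloccount_value)
    print("%-18s %+.1f%%" % (name, diff))
```

All three scc values are within 10% of their SLOCCount counterparts.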

boyter added a commit that referenced this issue Nov 1, 2021
Add a overhead commandline flag to set the multiplier for corporate overhead. #281
@boyter
Owner

boyter commented Nov 1, 2021

That looks correct to me, and it is in line with my understanding of COCOMO, how it is derived and how it works.

So all looks good, and the new code by @fschaefer allows this to be configured even further now. I think that's probably good enough, especially with a combination of --overhead, --cocomo-project-type and --avg-wage.
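
For reference, here is a sketch of how these three knobs interact in the Basic COCOMO model. The coefficient table holds the standard Basic COCOMO values for the three project classes. The function and parameter names are hypothetical and do not claim to mirror scc's internals.

```python
# Standard Basic COCOMO coefficients (a, b, c, d) per project class.
COEFFICIENTS = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}

def estimate(sloc, project_type="organic", avg_wage=56286, overhead=2.4):
    a, b, c, d = COEFFICIENTS[project_type]
    effort = a * (sloc / 1000.0) ** b   # person-months
    schedule = c * effort ** d          # months
    people = effort / schedule
    # Integer monthly wage, matching the figures scc printed earlier in the thread.
    cost = effort * (avg_wage // 12) * overhead
    return cost, schedule, people

for project_type in COEFFICIENTS:
    cost, schedule, people = estimate(6069, project_type)
    print("%-14s $%d  %.2f months  %.2f people"
          % (project_type, cost, schedule, people))
```

With the defaults, the organic row reproduces the PBD figures from the original report. Changing the project class, wage or overhead moves the estimate accordingly.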

Going to leave this open but will close it on the next release, once I actually get around to doing it.

boyter moved this from Todo to Done in 3.1.0 on Dec 14, 2021
@boyter
Owner

boyter commented Dec 14, 2021

Latest updates for 3.1.0 release should resolve all of the above.

boyter closed this as completed on Dec 14, 2021