Context: A customer wants to get their source code under control. In this analysis, we identify existing concepts in the source code based on naming conventions. The goal is to find commonly used naming conventions and document them in the architecture documentation, so that every developer can understand these concepts when they come across them in the source code.

In [1]:
import glob
path = "../../spring-framework/"
java_filelist = glob.glob(path + "**/*.java", recursive=True)
java_filelist[:5]

['../../spring-framework/buildSrc/src/main/java/org/springframework/build/api/ApiDiffPlugin.java',
 '../../spring-framework/buildSrc/src/main/java/org/springframework/build/compile/CompilerConventionsPlugin.java',
 '../../spring-framework/buildSrc/src/main/java/org/springframework/build/hint/RuntimeHintsAgentExtension.java',
 '../../spring-framework/buildSrc/src/main/java/org/springframework/build/hint/RuntimeHintsAgentPlugin.java',
 '../../spring-framework/buildSrc/src/main/java/org/springframework/build/optional/OptionalDependenciesPlugin.java']

In [2]:
import pandas as pd

code = pd.DataFrame(java_filelist, columns=["filepath"])
code["filepath"] = code["filepath"].str.replace(path, "", regex=False)
code = code[~code["filepath"].str.endswith("package-info.java")].copy()
code.head()

Unnamed: 0,filepath
0,buildSrc/src/main/java/org/springframework/bui...
1,buildSrc/src/main/java/org/springframework/bui...
2,buildSrc/src/main/java/org/springframework/bui...
3,buildSrc/src/main/java/org/springframework/bui...
4,buildSrc/src/main/java/org/springframework/bui...


In [3]:
# take the file name after the last "/" and strip the ".java" extension
code["type"] = code["filepath"].str.rsplit("/", n=1).str[-1].str.replace(".java", "", regex=False)
code.head()

Unnamed: 0,filepath,type
0,buildSrc/src/main/java/org/springframework/bui...,ApiDiffPlugin
1,buildSrc/src/main/java/org/springframework/bui...,CompilerConventionsPlugin
2,buildSrc/src/main/java/org/springframework/bui...,RuntimeHintsAgentExtension
3,buildSrc/src/main/java/org/springframework/bui...,RuntimeHintsAgentPlugin
4,buildSrc/src/main/java/org/springframework/bui...,OptionalDependenciesPlugin


In [4]:
import re
 
def split_camel_case_split(name):
    # split at upper-case boundaries; runs of capitals (acronyms) stay together
    return re.findall(r'[A-Z](?:[a-z]+|[A-Z]*(?=[A-Z]|$))', name)

code["splitted"] = code["type"].apply(split_camel_case_split)
code.head()

Unnamed: 0,filepath,type,splitted
0,buildSrc/src/main/java/org/springframework/bui...,ApiDiffPlugin,"[Api, Diff, Plugin]"
1,buildSrc/src/main/java/org/springframework/bui...,CompilerConventionsPlugin,"[Compiler, Conventions, Plugin]"
2,buildSrc/src/main/java/org/springframework/bui...,RuntimeHintsAgentExtension,"[Runtime, Hints, Agent, Extension]"
3,buildSrc/src/main/java/org/springframework/bui...,RuntimeHintsAgentPlugin,"[Runtime, Hints, Agent, Plugin]"
4,buildSrc/src/main/java/org/springframework/bui...,OptionalDependenciesPlugin,"[Optional, Dependencies, Plugin]"
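
To make the behavior of the regular expression above easier to follow, here is a minimal sketch that applies the same pattern to a few names. `XMLHttpRequest` is a made-up name used only to illustrate acronym handling; the other names appear in the analyzed code base. Note that digits are not matched and therefore drop out of the result.

```python
import re

# Same pattern as in split_camel_case_split above
CAMEL_CASE_PART = re.compile(r'[A-Z](?:[a-z]+|[A-Z]*(?=[A-Z]|$))')

def split_name(name):
    """Return the CamelCase parts of a type name."""
    return CAMEL_CASE_PART.findall(name)

print(split_name("ApiDiffPlugin"))   # ['Api', 'Diff', 'Plugin']
print(split_name("XMLHttpRequest"))  # hypothetical name: ['XML', 'Http', 'Request']
print(split_name("Spr8849Tests"))    # digits are skipped: ['Spr', 'Tests']
print(split_name("A"))               # ['A']
```

Names that consist of a single part, such as `A`, produce a one-element list, which matters for the negative indexing in the next step.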


In [5]:
code["name_-1"] = code['splitted'].str[-1].fillna("")
code["name_-2"] = code['splitted'].str[-2].fillna("")
code["name_-3"] = code['splitted'].str[-3].fillna("")
code["name_-2_-1"] = code["name_-2"] + code["name_-1"]
code["name_-3_-2_-1"] = code["name_-3"] + code["name_-2"] + code["name_-1"]
code.iloc[:,-5:].head()

Unnamed: 0,name_-1,name_-2,name_-3,name_-2_-1,name_-3_-2_-1
0,Plugin,Diff,Api,DiffPlugin,ApiDiffPlugin
1,Plugin,Conventions,Compiler,ConventionsPlugin,CompilerConventionsPlugin
2,Extension,Agent,Hints,AgentExtension,HintsAgentExtension
3,Plugin,Agent,Hints,AgentPlugin,HintsAgentPlugin
4,Plugin,Dependencies,Optional,DependenciesPlugin,OptionalDependenciesPlugin
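
The `fillna("")` calls matter for short names. A small sketch with toy data (not taken from the code base) shows what happens when a name has fewer parts than the requested level: `.str[-2]` yields a missing value, which is replaced by an empty string, so the combined name falls back to the shorter form.

```python
import pandas as pd

# Toy stand-in for the "splitted" column: a one-part and a three-part name
parts = pd.Series([["A"], ["Api", "Diff", "Plugin"]])

last = parts.str[-1].fillna("")
second_last = parts.str[-2].fillna("")  # out of range for ["A"]: NaN, then ""

combined = second_last + last
print(combined.tolist())  # ['A', 'DiffPlugin']
```

This is why single-part names such as `A` show up unchanged in the level -2 and level -3 columns.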


In [6]:
pd.DataFrame(code['name_-1'].value_counts()).head()

Unnamed: 0,name_-1
Tests,2294
Exception,260
Resolver,225
Bean,166
Factory,153


In [7]:
pd.DataFrame(code['name_-2_-1'].value_counts()).head()

Unnamed: 0,name_-2_-1
IntegrationTests,124
ResolverTests,119
ArgumentResolver,84
HandlerTests,76
FactoryBean,71


Taking level -3 into consideration makes it clear that it might not be the best choice, because the stereotypes at that level partly consist of domain names. Thus, level -2 seems to be a good candidate for analyzing the corresponding stereotypes in more detail.
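
One way to back this judgment with a number is to check how many types fall under a stereotype that occurs more than once at each level. The following is only a sketch under assumptions: the helper `coverage`, the threshold `min_count`, and the toy lists are hypothetical and not part of the original analysis.

```python
import pandas as pd

def coverage(names, min_count=2):
    """Share of entries whose stereotype occurs at least min_count times."""
    counts = names.value_counts()
    recurring = counts[counts >= min_count]
    return recurring.sum() / len(names)

# Toy data: the same four types seen at level -1 and at level -3
level_1 = pd.Series(["Resolver", "Resolver", "Resolver", "Tests"])
level_3 = pd.Series([
    "MethodArgumentResolver",
    "MethodArgumentResolver",
    "ViewNameResolver",
    "ArgumentResolverTests",
])

print(coverage(level_1))  # 0.75
print(coverage(level_3))  # 0.5
```

On the real data, the same helper could be applied to the `name_-1`, `name_-2_-1` and `name_-3_-2_-1` columns to compare how much of the code base each level explains.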

In [8]:
pd.DataFrame(code['name_-3_-2_-1'].value_counts()).head()

Unnamed: 0,name_-3_-2_-1
MethodArgumentResolver,68
ArgumentResolverTests,61
BeanDefinitionParser,46
FactoryBeanTests,34
MessageConverterTests,34


Get a list of the source code files that represent one concept at level -1.

In [9]:
code_stereotype_per_file = code.groupby(['name_-1', 'filepath'])[['type']].count()
code_stereotype_per_file.head()

name_-1,filepath,type
A,spring-expression/src/test/java/org/springframework/expression/spel/spr10210/A.java,1
Access,spring-aop/src/main/java/org/springframework/aop/RawTargetAccess.java,1
Access,spring-core-test/src/main/java/org/springframework/aot/test/generator/compile/CompileWithTargetClassAccess.java,1
Accessor,spring-beans/src/main/java/org/springframework/beans/AbstractNestablePropertyAccessor.java,1
Accessor,spring-beans/src/main/java/org/springframework/beans/AbstractPropertyAccessor.java,1


In [10]:
# attach the total number of files per stereotype to every file, largest groups first
code_stereotypes = code_stereotype_per_file.groupby(['name_-1']).transform("sum").sort_values(by="type", ascending=False)
code_stereotypes

name_-1,filepath,type
Tests,spring-test/src/test/java/org/springframework/test/context/junit4/spr8849/Spr8849Tests.java,2294
Tests,spring-test/src/test/java/org/springframework/test/context/configuration/ContextConfigurationWithPropertiesExtendingPropertiesAndInheritedLoaderTests.java,2294
Tests,spring-test/src/test/java/org/springframework/test/context/cache/MethodLevelDirtiesContextTests.java,2294
Tests,spring-test/src/test/java/org/springframework/test/context/cache/LruContextCacheTests.java,2294
Tests,spring-test/src/test/java/org/springframework/test/context/cache/ContextCacheUtilsTests.java,2294
...,...,...
Enum,spring-beans/src/testFixtures/java/org/springframework/beans/testfixture/beans/CustomEnum.java,1
Evict,spring-context/src/main/java/org/springframework/cache/annotation/CacheEvict.java,1
Expander,spring-r2dbc/src/main/java/org/springframework/r2dbc/core/NamedParameterExpander.java,1
Export,spring-context/src/main/java/org/springframework/context/annotation/EnableMBeanExport.java,1
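
The two-step grouping above (count per stereotype and file, then sum per stereotype) attaches the group size to every file. As a sketch on toy data, the same numbers can be obtained with a single `transform("count")`; the small frame below is invented for illustration and only mimics the columns of `code`.

```python
import pandas as pd

# Toy frame with the same columns as used in the analysis
toy = pd.DataFrame({
    "name_-1": ["Tests", "Tests", "Bean"],
    "filepath": ["a/ATests.java", "b/BTests.java", "c/FooBean.java"],
    "type": ["ATests", "BTests", "FooBean"],
})

# Two-step variant, as in the cells above
per_file = toy.groupby(["name_-1", "filepath"])[["type"]].count()
two_step = per_file.groupby("name_-1").transform("sum")

# One-step variant: group size per row, index of the original frame is kept
direct = toy.groupby("name_-1")["type"].transform("count")

print(two_step["type"].tolist())  # [1, 2, 2]
print(direct.tolist())            # [2, 2, 1]
```

The values agree; only the row order differs, because the two-step variant is sorted by its group keys while the one-step variant keeps the original row order.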


In [11]:
code_stereotypes.to_excel("output/concept_stereotypes_-1.xlsx")
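
The export assumes that the `output` directory already exists. A minimal sketch of a more defensive export follows; the helper `export_table` and the demo frame are hypothetical, and CSV is used here only to keep the sketch free of an Excel writer dependency (`to_excel` needs an engine such as openpyxl to be installed).

```python
import tempfile
from pathlib import Path

import pandas as pd

def export_table(df, out_dir, filename):
    """Create out_dir if needed and write df there as CSV."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    target = out_path / filename
    df.to_csv(target, index=False)
    return target

# Demo data and a temporary directory, so nothing is written next to the notebook
demo = pd.DataFrame({"name_-1": ["Tests", "Bean"], "type": [2, 1]})
written = export_table(demo, Path(tempfile.mkdtemp()) / "output", "concept_stereotypes_demo.csv")
print(written.name)  # concept_stereotypes_demo.csv
```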

Repeat the same steps for level -2, that is, the last two name parts combined.

In [12]:
code_stereotype_per_file_2_1 = code.groupby(['name_-2_-1', 'filepath'])[['type']].count()
code_stereotype_per_file_2_1.head(20)

name_-2_-1,filepath,type
A,spring-expression/src/test/java/org/springframework/expression/spel/spr10210/A.java,1
ACATester,spring-context/src/testFixtures/java/org/springframework/context/testfixture/beans/ACATester.java,1
AbstractClass,spring-context/src/test/java/example/profilescan/SomeAbstractClass.java,1
AbstractController,spring-context-indexer/src/test/java/org/springframework/context/index/sample/AbstractController.java,1
AbstractController,spring-webmvc/src/main/java/org/springframework/web/servlet/mvc/AbstractController.java,1
AbstractDecoder,spring-core/src/main/java/org/springframework/core/codec/AbstractDecoder.java,1
AbstractEncoder,spring-core/src/main/java/org/springframework/core/codec/AbstractEncoder.java,1
AbstractEnvironment,spring-core/src/main/java/org/springframework/core/env/AbstractEnvironment.java,1
AbstractErrors,spring-context/src/main/java/org/springframework/validation/AbstractErrors.java,1
AbstractException,spring-beans/src/main/java/org/springframework/beans/factory/BeanIsAbstractException.java,1


In [13]:
code_stereotypes_2_1 = code_stereotype_per_file_2_1 \
    .groupby(['name_-2_-1']) \
    .transform("sum") \
    .sort_values(by=["type", "name_-2_-1", "filepath"], ascending=False) \
    .reset_index()
code_stereotypes_2_1.head(20)

Unnamed: 0,name_-2_-1,filepath,type
0,IntegrationTests,spring-websocket/src/test/java/org/springframe...,124
1,IntegrationTests,spring-websocket/src/test/java/org/springframe...,124
2,IntegrationTests,spring-websocket/src/test/java/org/springframe...,124
3,IntegrationTests,spring-websocket/src/test/java/org/springframe...,124
4,IntegrationTests,spring-websocket/src/test/java/org/springframe...,124
5,IntegrationTests,spring-websocket/src/test/java/org/springframe...,124
6,IntegrationTests,spring-websocket/src/test/java/org/springframe...,124
7,IntegrationTests,spring-webmvc/src/test/java/org/springframewor...,124
8,IntegrationTests,spring-webmvc/src/test/java/org/springframewor...,124
9,IntegrationTests,spring-webmvc/src/test/java/org/springframewor...,124


In [14]:
code_stereotypes_2_1.to_excel("output/concept_stereotypes_-2_-1.xlsx", index=False)