Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom ranking configuration for projects in algolia #302

Merged
merged 8 commits into from Mar 19, 2019

Conversation

@aquibbaig
Copy link
Collaborator

aquibbaig commented Feb 9, 2019

This is an initial PR for setting the custom rankings for the projects in Algolia. Please review it and suggest me changes. The metrics are set while creation of a new project. This is for the issue #276

Copy link
Collaborator

sandipbhuyan left a comment

Resolve the PHPDocs mistakes. Otherwise LGTM

@@ -221,6 +221,20 @@ public function viewAction(Request $request, $parentName, $projectName)
'isHostedOnGithub' => GithubRepoCrawler::isGithubRepoUrl($p->getSourceRepo()->getUrl()),
];
//Retrieve the project and set the code quality metrics

This comment has been minimized.

Copy link
@sandipbhuyan

sandipbhuyan Feb 15, 2019

Collaborator

Add space after the starting of the comment section // Retrieve

}
/**
* Get the code quality metrics for the project

This comment has been minimized.

Copy link
@sandipbhuyan

sandipbhuyan Feb 15, 2019

Collaborator

New line after description

@aquibbaig

This comment has been minimized.

Copy link
Collaborator Author

aquibbaig commented Feb 15, 2019

@sandipbhuyan resolved 👍

Copy link
Collaborator

sandipbhuyan left a comment

LGTM

Copy link
Contributor

imphil left a comment

Thanks for your contribution @aquibbaig! I've looked through the PR and I've made a couple comments in the source code. Please have a look at them.

* Code quality metrics for the project
* @var float
*
* @ORM\Column(type="float", nullable=true)

This comment has been minimized.

Copy link
@imphil

imphil Feb 19, 2019

Contributor

What's the motivation for storing this as float? Isn't a (possibly scaled, e.g. multiplied by 10) integer sufficient?

This comment has been minimized.

Copy link
@aquibbaig

aquibbaig Feb 28, 2019

Author Collaborator

@imphil The function getCodeQualityScore that calculates the project quality metrics returns a float value according to its definition

This comment has been minimized.

Copy link
@imphil

imphil Mar 14, 2019

Contributor

Floats are really tricky to get right in comparisons (and hence sorting), integers are much easier to handle. How about multiplying the metrics value by 100 and storing that as integer value, effectively limiting the precision of the float to two digits.

*
* @ORM\Column(type="float", nullable=true)
*/
private $projectMetrics;

This comment has been minimized.

Copy link
@imphil

imphil Feb 19, 2019

Contributor

The name projectMetrics indicates multiple items (plural). That's a simple number you want to store here? How about score? (the project prefix is already in the name of the table)

@@ -221,6 +221,20 @@ public function viewAction(Request $request, $parentName, $projectName)
'isHostedOnGithub' => GithubRepoCrawler::isGithubRepoUrl($p->getSourceRepo()->getUrl()),
];
// Retrieve the project and set the code quality metrics

This comment has been minimized.

Copy link
@imphil

imphil Feb 19, 2019

Contributor

I didn't really follow the full discussion, but why is there a database update in the view controller? That looks wrong.

This comment has been minimized.

Copy link
@agathver

agathver Feb 20, 2019

Collaborator

My suggestion is to move it to GitRepoCrawler, we can persist it there itself.

@aquibbaig

This comment has been minimized.

Copy link
Collaborator Author

aquibbaig commented Feb 20, 2019

@imphil The data which is pushed into algolia is normalised only when we make a function in the entity. However it is not advisable to use services inside the entity itself since the project is already not created. So we cannot access the CommitRepository file inside the Project entity, which is the requirement for the project metrics to be calculated.
And the custom ranking stars are subjected to change as per the changes in the code. This data is updated when we display the project itself by fetching the metadata. So, I hoped that if we run a database migration and set the metrics as and when it is fetched, we can always update the database when they are updated further too. And we custom rank it in algolia accordingly as we call the getter in the entity itself.

Anyways, we can also have a way around if you can suggest one :)

@@ -221,6 +221,20 @@ public function viewAction(Request $request, $parentName, $projectName)
'isHostedOnGithub' => GithubRepoCrawler::isGithubRepoUrl($p->getSourceRepo()->getUrl()),
];
// Retrieve the project and set the code quality metrics

This comment has been minimized.

Copy link
@agathver

agathver Feb 20, 2019

Collaborator

My suggestion is to move it to GitRepoCrawler, we can persist it there itself.

@agathver

This comment has been minimized.

Copy link
Collaborator

agathver commented Feb 20, 2019

@aquibbaig We can always add manually re-trigger a crawl of all registered projects. No need to fire a crawl on initial read. We anyway want to do a periodic crawl through a cronjob.

@aquibbaig

This comment has been minimized.

Copy link
Collaborator Author

aquibbaig commented Feb 28, 2019

@agathver @imphil I tried using the project metrics provider service but the service uses instances of commitRepository and contributorRepository, hence I had to move the logic totally to GitRepoCrawler, where the data for the project metrics is crawled. Locally setting up the environment for me also crawls the scores for other projects too. This PR is for #276 . Please review it.

Copy link
Collaborator

agathver left a comment

LGTM

Copy link
Contributor

imphil left a comment

Thanks for your update to this PR @aquibbaig! I've a couple comments mostly on code duplication, can you have a look at those? Thanks!

* Code quality metrics for the project
* @var float
*
* @ORM\Column(type="float", nullable=true)

This comment has been minimized.

Copy link
@imphil

imphil Mar 14, 2019

Contributor

Floats are really tricky to get right in comparisons (and hence sorting), integers are much easier to handle. How about multiplying the metrics value by 100 and storing that as integer value, effectively limiting the precision of the float to two digits.

*
* @ORM\Column(type="float", nullable=true)
*/
private $score;

This comment has been minimized.

Copy link
@imphil

imphil Mar 14, 2019

Contributor

could you change that attribute to qualityScore to make it a bit more descriptive?

$this->manager->persist($project);
return true;
}
/**

This comment has been minimized.

Copy link
@imphil

imphil Mar 14, 2019

Contributor

All of this code is copied from ProjectMetricsProvider.php. Why don't you inject that class as service into the GitRepoCrawler and use it here?

This comment has been minimized.

Copy link
@aquibbaig

aquibbaig Mar 15, 2019

Author Collaborator

@imphil It took some time to figure out inheritance rules among the classes. I guess this fixes #276. If there is something else I can fix, please let me know.

Copy link
Contributor

imphil left a comment

Can't you just inject the ProjectMetricsProvider service into the crawler directly, as opposed to injecting it into the UpdateProjectInformation class and passing it on from there?

/**
* Set code quality metrics for the project
*
* @param float $qualityScore

This comment has been minimized.

Copy link
@imphil

imphil Mar 15, 2019

Contributor

it's int now, isn't it?

@aquibbaig

This comment has been minimized.

Copy link
Collaborator Author

aquibbaig commented Mar 15, 2019

@imphil I had overlooked that float typo. Is this okay now? :D

Copy link
Contributor

imphil left a comment

I think I have one more chance request regarding where to place the metrics class, but beyond that it looks good now!

@@ -88,6 +98,14 @@ class GitRepoCrawler extends RepoCrawler
private $repoClonePath = null;
protected $projectMetricsProvider;
public function __construct(SourceRepo $repo, MarkupToHtmlConverter $markupConverter, ProcessCreator $processCreator, ObjectManager $manager, LoggerInterface $logger, ProjectMetricsProvider $projectMetricsProvider)

This comment has been minimized.

Copy link
@imphil

imphil Mar 15, 2019

Contributor
  • Style: Please split this line into two (at least).
  • What makes the project metrics git-specific to be included only in the GitRepoCrawler, and not in the more general RepoCrawler? It looks to me that it should go there.
use Librecores\ProjectRepoBundle\Entity\Project;
use Librecores\ProjectRepoBundle\Util\StatsUtil;
use Librecores\ProjectRepoBundle\Util\Dates;
use Librecores\ProjectRepoBundle\Doctrine\ProjectMetricsProvider;

This comment has been minimized.

Copy link
@imphil

imphil Mar 15, 2019

Contributor

I don't see code that makes use of all those added classes. Why are they necessary?

This comment has been minimized.

Copy link
@aquibbaig

aquibbaig Mar 19, 2019

Author Collaborator

For the constructor

@@ -77,6 +78,7 @@ public function __construct(
* Update the project associated with the crawled repository with
* information extracted from the repo
*
* @param ProjectMetricsProvider $projectMetricsProvider

This comment has been minimized.

Copy link
@imphil

imphil Mar 15, 2019

Contributor

that parameter doesn't exist any more.

@@ -3,6 +3,7 @@
namespace Librecores\ProjectRepoBundle\RepoCrawler;
use Doctrine\Common\Persistence\ObjectManager;
use Librecores\ProjectRepoBundle\Doctrine\ProjectMetricsProvider;

This comment has been minimized.

Copy link
@imphil

imphil Mar 15, 2019

Contributor

That's unnecessary as well it seems.

Copy link
Contributor

imphil left a comment

Thanks @aquibbaig for your update on this PR. I've added a couple of comments in the code.

ProcessCreator $processCreator,
ObjectManager $manager,
LoggerInterface $logger,
$projectMetricsProvider

This comment has been minimized.

Copy link
@imphil

imphil Mar 19, 2019

Contributor

Is there a reason for this parameter not having a type annotation?

This comment has been minimized.

Copy link
@aquibbaig

aquibbaig Mar 19, 2019

Author Collaborator

Both of the approaches will work because projectmetrics are initialized from the RepoCrawler itself thru the constructor in GitRepoCrawler. So, I removed the effort of requiring the ProjectMetricsProvider class always and injecting it. However the type should be there. So I'm making the changes

This comment has been minimized.

Copy link
@agathver

agathver Mar 19, 2019

Collaborator

It's better to put the type annotation there

public function getProjectMetricsProvider()
{
return $this->projectMetricsProvider;
}

This comment has been minimized.

Copy link
@imphil

imphil Mar 19, 2019

Contributor

Why is that function needed? Within the subclasses you can access the metrics provider through $this->projectMetricsProvider directly.

$this->abortIf($this->connection->getDatabasePlatform()->getName() !== 'mysql', 'Migration can only be executed safely on \'mysql\'.');
$this->addSql('ALTER TABLE Project ADD qualityScore INT DEFAULT NULL');
$this->addSql('DROP INDEX classification_idx ON ProjectClassification');

This comment has been minimized.

Copy link
@imphil

imphil Mar 19, 2019

Contributor

I wonder what change caused this index to be dropped?

use Symfony\Component\Filesystem\Filesystem;
use Symfony\Component\Filesystem\Exception\IOExceptionInterface;
use Symfony\Component\Process\Process;

This comment has been minimized.

Copy link
@imphil

imphil Mar 19, 2019

Contributor

Why all those changes? Were they wrong before? I don't see changes in the code that would make those necessary.

This comment has been minimized.

Copy link
@aquibbaig

aquibbaig Mar 19, 2019

Author Collaborator

Some classes are necessary because I have created a constructor to inject projectmetrics along with all other classes, specifically in GitRepoCrawler as there was no constructor here. I am sanitising the file by removing extra classes with the next commit.

@aquibbaig

This comment has been minimized.

Copy link
Collaborator Author

aquibbaig commented Mar 19, 2019

Thanks, I'll update it today

ProcessCreator $processCreator,
ObjectManager $manager,
LoggerInterface $logger,
$projectMetricsProvider

This comment has been minimized.

Copy link
@agathver

agathver Mar 19, 2019

Collaborator

It's better to put the type annotation there

@@ -53,9 +53,10 @@ public function __construct(
ProcessCreator $processCreator,
ObjectManager $manager,
LoggerInterface $logger,
GithubApiService $ghApi
GithubApiService $ghApi,
$projectMetricsProvider

This comment has been minimized.

Copy link
@agathver

agathver Mar 19, 2019

Collaborator

Add type annotation here

@imphil imphil merged commit 5a77ddc into librecores:master Mar 19, 2019
@imphil

This comment has been minimized.

Copy link
Contributor

imphil commented Mar 19, 2019

Thanks @aquibbaig and congratulations to your first contribution! This is now merged and will be deployed with the next push to production.

@aquibbaig

This comment has been minimized.

Copy link
Collaborator Author

aquibbaig commented Mar 21, 2019

Thanks @imphil

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.