Create benchmark #167
One of the main reasons Yampa has changed very little, and that so many PRs take so long to integrate, is that we try to make sure we do not seriously impact performance or break the API when we merge changes. This commit implements a very, very basic benchmark that is integrated in the cabal file.
I am interested in this, but have several concerns I feel should be clarified before proceeding further:
As a general approach, not just to this task but to all of them, I would go step by step. The reality is that we are all really busy, and it pays in the long run to strategize this as a series of ridiculously small steps in the right direction, trying to complete each step before moving on to the next one. A small step gives us the ability to finish things and put them completely out of our minds, and a sense of accomplishment for each task completed. So, first, let's create a very small benchmark for Yampa. Then, let's see what we can say about different Yampa constructs. Then, how we could automate that analysis (if we can), or at least facilitate it. And so on. It'll be long, but it'll be attainable. If we try to aim too high from the beginning, we will likely never finish and, even if we do, we will come up with the ultimate, perfect FRP benchmark that nobody will ever complete. (And for what it's worth, we have tried this at least twice before with Yampa and twice with dunai, and that's precisely what happened in all cases.)
I know this has been a long-standing issue, so I'm just writing so people know there is progress. I currently have a small benchmark, together with a way of comparing its results to those of a prior benchmark run. Hopefully, this will give us a measure of how much better or worse a change is, and help us start making decisions based on real data. We can later improve the quality of those decisions by improving the benchmarks in terms of coverage, depth, and reliability.
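To make this concrete, below is a minimal sketch of what such a benchmark could look like, assuming criterion is used as the harness and Yampa's embed/deltaEncode are used to run a signal function over a fixed input stream. The benchmark names, signal functions, and sample counts are illustrative assumptions, not the contents of the actual branch:

```haskell
-- A minimal Yampa benchmark sketch using criterion (illustrative only).
import Control.Arrow  (arr, (>>>))
import Criterion.Main (bench, defaultMain, nf)
import FRP.Yampa      (SF, deltaEncode, embed, integral)

main :: IO ()
main = defaultMain
  [ bench "yampa/integral"    $ nf (runSF integral)                100000
  , bench "yampa/composition" $ nf (runSF (arr (+1) >>> arr (*2))) 100000
  ]

-- Run a signal function over n constant samples spaced 0.1 seconds apart,
-- forcing the final output so the whole network is actually stepped.
runSF :: SF Double Double -> Int -> Double
runSF sf n = last $ embed sf (deltaEncode 0.1 (replicate n 1.0))
```

Criterion can write its measurements to a CSV file (via --csv), which is what makes the automated comparison described below possible.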
This commit implements a basic benchmark that is integrated in the Yampa package. The goal is to have an initial benchmark infrastructure that can be used to evaluate the performance of Yampa, compare it across versions, and make evidence-based decisions about when new features should be incorporated into Yampa. [ci skip]
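For reference, "integrated in the Yampa package" boils down to a benchmark stanza in the .cabal file along these lines (a sketch; the stanza name, directory layout, and dependencies are assumptions, not the actual contents of the commit):

```cabal
benchmark yampa-benchmarks
  type:             exitcode-stdio-1.0
  main-is:          Bench.hs
  hs-source-dirs:   benchmarks
  build-depends:    base, Yampa, criterion
  default-language: Haskell2010
```

With a stanza like this in place, `cabal v1-bench` (used by the evaluation script further down) builds and runs the benchmark executable.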
I have just put up a branch with an initial benchmark: https://github.com/ivanperez-keera/Yampa/tree/develop-performance

It has two commits: the first one is for the release; the second one is just for documentation and is not to be included in the official release (at least, not as is). My intention is to merge the first as part of Yampa 0.14.4, which will be published 2 days from now.

This change does not address everything we've discussed so far. It is one benchmark, meant to put some infrastructure in place and get things rolling. I want benchmarks to become part of the normal process of releasing Yampa and evaluating new contributions. To me, it is VERY important that, whatever mechanism we put in place to create a benchmark, it be automated. If it takes a lot of steps to perform benchmarking, we will not use it continuously, or ever.

Please discuss. If there's an immediate gap that can be addressed quickly, please comment. If you see a problem with the current version, also please comment. Performance evaluation and benchmarking will be a big enough effort that we will not be able to get it completely right the first time. Perfect is the enemy of good.

Tagging @RiugaBachi @turion for awareness. Ideas from this proposal may also be used with bearriver and/or dunai.
Benchmarks are useless unless people actually use them to evaluate their proposed solutions. This commit modifies the README to document that benchmarks are already part of the Yampa package.
I'm about to merge this change. I will not merge the evaluation scripts, but I will simply put them here for now. To evaluate a specific commit, do the following:

```bash
#!/bin/bash
# Evaluate the current commit. The destination directory for all results
# is given as the first argument.
DESTINATION=$1

mkdir -p "$DESTINATION"

# Record the exact code under test: current commit and uncommitted changes.
git log -1 > "$DESTINATION/git-log"
git diff HEAD > "$DESTINATION/git-diff"

# Record the package and tool versions used for the run.
cabal v1-exec -- ghc-pkg list > "$DESTINATION/packages"
ghc --version > "$DESTINATION/ghc-version"
cabal --version > "$DESTINATION/cabal-version"

# Run the benchmarks, passing the destination directory on to the
# benchmark executable.
cabal v1-bench --benchmark-option="$DESTINATION"
```

To compare two evaluations, you can use the following Haskell program:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString.Lazy as BL
import           Data.Csv             (FromNamedRecord (parseNamedRecord),
                                       decodeByName, (.:))
import           Data.List            (transpose)
import qualified Data.Map             as M
import qualified Data.Vector          as V
import           System.Environment   (getArgs)
import           Text.PrettyPrint.Boxes

main :: IO ()
main = do
    -- Parse arguments.
    --
    -- f1 is the CSV file produced with the new version.
    -- f2 is the CSV file produced with the old version.
    -- v is a threshold (see below).
    (f1:f2:v:_) <- getArgs

    -- Parse results. The result of decoding each file is an Either, whose
    -- Right case holds a pair (Header, Vector Result). We use snd to keep
    -- the Vector.
    csvData1 <- fmap snd . decodeByName <$> BL.readFile f1
    csvData2 <- fmap snd . decodeByName <$> BL.readFile f2

    -- Obtain the correct comparisons and the violations by comparing the
    -- data from the two CSV files using an auxiliary comparison function.
    let correct    = fst <$> result
        violations = snd <$> result
        result     = compareResults comparisonFunc <$> csvData1 <*> csvData2

        -- v, the third argument to the program, acts as a threshold for how
        -- much better/worse the new version must be than the old one:
        --
        -- 1 means the old version must be slower than or equal to the new one.
        -- 2 means the new version must be at least twice as fast as the old one.
        -- 0.9 means the new version may be up to about 11% slower than the
        --     old one (new * 0.9 must not exceed old).
        comparisonFunc x y = x * read v <= y

    -- Print results.
    putStrLn "Correct"
    printResults correct
    putStrLn "Violations"
    printResults violations

-- | Compare two CSV databases, and produce the correct results and the
-- violations.
compareResults :: (Double -> Double -> Bool) -- ^ Comparison function.
               -> V.Vector Result            -- ^ Data from first file.
               -> V.Vector Result            -- ^ Data from second file.
               -> (M.Map String Comparison, M.Map String Comparison)
compareResults prop rows1 rows2 =
    M.partition (compareDurations prop) combinedCriterionData
  where
    combinedCriterionData = M.unionWith mergeComparisons map1 map2

    -- Turn the result data (a vector) into maps indexed by benchmark name.
    map1 = resultsToMapWith mkComparisonFst rows1
    map2 = resultsToMapWith mkComparisonSnd rows2

-- | Print the comparisons as a table, or an error message if a CSV file
-- could not be parsed.
printResults :: Either String (M.Map String Comparison) -> IO ()
printResults (Left s)  = putStrLn $ "Error: " ++ s
printResults (Right m) = putStrLn table
  where
    table = render $ hsep 3 left $ map (vcat left . map text)
                   $ transpose cols

    cols :: [[String]]
    cols = ["Benchmark", "Alt 1", "Alt 2", "Factor (alt2 / alt1)"]
         : [ [ benchmark, alt1, alt2, factor ]
           | (benchmark, comp) <- M.assocs m
           , let alt1 :: String
                 alt1 = maybe "(nothing)" show $ comparisonDuration1 comp
           , let alt2 :: String
                 alt2 = maybe "(nothing)" show $ comparisonDuration2 comp
           , let factor =
                   case (comparisonDuration1 comp, comparisonDuration2 comp) of
                     (Just v1, Just v2) -> show (v2 / v1)
                     _                  -> "N/A"
           ]

-- * Comparisons

-- | Comparison entry.
data Comparison = Comparison
  { comparisonDuration1 :: Maybe Double
  , comparisonDuration2 :: Maybe Double
  }
  deriving Show

-- | Constructor with the 1st comparison value only.
mkComparisonFst :: Double -> Comparison
mkComparisonFst v = Comparison (Just v) Nothing

-- | Constructor with the 2nd comparison value only.
mkComparisonSnd :: Double -> Comparison
mkComparisonSnd v = Comparison Nothing (Just v)

-- | Merge the first duration from one comparison with the second duration
-- from another comparison.
mergeComparisons :: Comparison -> Comparison -> Comparison
mergeComparisons c1 c2 =
  Comparison (comparisonDuration1 c1) (comparisonDuration2 c2)

-- | A comparison succeeds if both values exist and the comparison function
-- holds for them.
compareDurations :: (Double -> Double -> Bool) -> Comparison -> Bool
compareDurations prop (Comparison (Just d1) (Just d2)) = prop d1 d2
compareDurations _    _                                = False

-- * Criterion

-- | Datatype representing a row of results from Criterion.
data Result = Result
  { name     :: !String
  , mean     :: !Double
  , meanLB   :: !Double
  , meanUB   :: !Double
  , stddev   :: !Double
  , stddevLB :: !Double
  , stddevUB :: !Double
  }

-- | Instance to parse a result from a named CSV row.
instance FromNamedRecord Result where
  parseNamedRecord r = Result <$> r .: "Name"
                              <*> r .: "Mean"
                              <*> r .: "MeanLB"
                              <*> r .: "MeanUB"
                              <*> r .: "Stddev"
                              <*> r .: "StddevLB"
                              <*> r .: "StddevUB"

-- | Build a map of comparisons from a vector of results read from the CSV
-- file. We use this auxiliary type so we can use Map union to merge results
-- from two files.
resultsToMapWith :: (Double -> Comparison)
                 -> V.Vector Result
                 -> M.Map String Comparison
resultsToMapWith f = vectorToMap . V.map (\r -> (name r, f (mean r)))

-- * Auxiliary

-- | Turn a vector into a map.
vectorToMap :: Ord key => V.Vector (key, value) -> M.Map key value
vectorToMap vec =
  V.foldl (\myMap (key, value) -> M.insert key value myMap) M.empty vec
```
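For reference, assuming the evaluation script has been run twice and that criterion wrote a CSV file into each destination directory, a comparison run could look like `runghc Compare.hs results-new/bench.csv results-old/bench.csv 1` (the file names here are hypothetical; the actual CSV name depends on how the benchmark invokes criterion). The new version's CSV comes first, the old one second, and any benchmark listed under "Violations" is one where the new version did not meet the chosen threshold.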
One of the main reasons Yampa has changed very little, and that so many PRs take so long to integrate, is that we try to make sure we do not seriously impact performance or break the API when we merge changes.
We need a series of benchmarks for Yampa that allow us to 1) measure performance, and 2) compare performance, both against prior implementations and, potentially, against other FRP implementations. Of course, if the implementations are very different, that comparison may be really hard and may need to be addressed separately.
I'm creating this issue to document this line of work and have a way to track progress in this direction.
This is also relevant for: ivanperez-keera/dunai#233